Taking aim at AI bottlenecks based on NoC technology developed by NetSpeed Systems.
Moving data is one of the big challenges in the AI world. There is so much data being generated that simply moving it back and forth between processors and memories demands significant power and enormous bandwidth, and frequently introduces delays that bog down performance. Now, with substantially more processing, different types of processors on each system on chip (SoC), and the emerging chiplet paradigm, the problem is growing exponentially.
It should come as no surprise that investors are placing their bets here on a whole range of startups with innovative ideas. Faster and more efficient data movement is worth big bucks today, and if these systems can be assembled more quickly than today's one-off designs by using standardized parts, methodologies, and tools, then the entire semiconductor industry will benefit, regardless of whether the designs are for the edge or the data center.
“The common theme we’re seeing is that data movement is becoming a critical issue,” said Nandan Nayampally, chief commercial officer at startup Baya Systems. “The key challenge with AI acceleration is that it’s not a standard one-size-fits-all, either a CPU or a GPU solution. Neural processing units (NPUs) and other specialized accelerators are being integrated into larger SoCs, or even into systems of chiplets. Costs of development are skyrocketing, the efficiency demands are increasing, and there is a need for ‘guaranteed’ performance on ever-changing workloads. How do you analyze that? How do you build systems that are most efficient for the next generation and keep them future-proofed? That is a growing problem for semiconductor and systems vendors.”
Where Baya Systems is focused today is on developing a unified network-on-chip fabric with a common transport layer. “You can design it such that you can combine a coherent sub-system and a non-coherent sub-system over a unified transport layer,” Nayampally said. “You can re-use wire or logic wherever possible based on your performance/cost constraints. You can always do very compartmentalized sub-NoCs, as well. But some of the benefits of the unified NoC are seen in the high-performance scalable systems.”
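The idea of a single transport layer carrying both coherent and non-coherent traffic can be illustrated with a toy model. This is purely a sketch, not Baya's actual design or API; every class and field name below is invented. It shows protocol semantics as an overlay on shared transport, which is what lets the same wires and logic be reused across sub-systems:

```python
from dataclasses import dataclass

@dataclass
class Packet:
    """A transport-layer packet; the protocol is just a tag (overlay)."""
    src: str
    dst: str
    protocol: str   # e.g. "coherent" or "non-coherent"
    payload: bytes

class UnifiedTransport:
    """Hypothetical model: one shared fabric for all traffic classes.

    A coherent sub-system (CPU clusters snooping caches) and a
    non-coherent one (DMA engines, accelerators) both send packets
    over the same links; only the overlay protocol differs.
    """
    def __init__(self):
        self.delivered = []

    def send(self, pkt: Packet):
        # Both traffic classes traverse the same physical transport,
        # so wiring and routing logic are shared rather than duplicated.
        self.delivered.append(pkt)

fabric = UnifiedTransport()
fabric.send(Packet("cpu0", "l3cache", "coherent", b"snoop"))
fabric.send(Packet("npu0", "dram", "non-coherent", b"dma"))
```

In a real NoC the transport layer also handles routing, virtual channels, and quality-of-service arbitration between the overlaid protocols; the point of the sketch is only that the protocols ride on one common fabric rather than two separate interconnects.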
This is effectively the third generation of technology originally developed by NetSpeed Systems, which Intel acquired in 2018. In fact, Baya Systems was founded by Sailesh Kumar, who co-founded NetSpeed with Sundari Mitra.
“Our NoC is topology agnostic,” said Nayampally. “There’s a lot of flexibility in how you build it, and the transport architecture is quite different. It’s what allows us to overlay different protocols on it, from the ground up. You can partition it and build it at a modular, tile-able level for scale. It’s physically aware, thereby making it much easier to implement and faster to deploy, reducing risk and cost, and accelerating time-to-market. Baya has built a data-driven software and IP platform that enables the in-depth architectural analysis a designer needs at the SoC level, including partitioning if it is a system of chiplets, so designers can build an advanced system with a best-in-class fabric that delivers the performance and efficiency the customers’ workloads demand.”
Baya Systems emerged from stealth mode less than a year ago. The company announced it had raised more than $36 million in a Series B round led by Maverick Silicon, backed by a strategic investment from Synopsys, with current investors including Matrix Partners and Intel Capital.
“Originally, NoCs were perceived to be a small IP market, because most of the big semiconductor houses built them in-house — and still do,” Nayampally said. “You need specialized solutions for this. But it’s very difficult for every company to build best-in-class expertise in-house. The market for chiplet-ready fabric IP, and more importantly the related software tooling, is expected to grow to over $4 billion by the end of this decade. And beyond that, as you look at enabling advanced chiplet-based design for merchant chiplets for things like I/O hubs, networking, and processor chiplets, these could reach $100 billion in 10 years.”