Not everything needs blazing-fast performance, but some things do.
Partitioning could well be one of the most important and pervasive trends since the invention of computers. It has been around for almost as long, too.
The idea dates back at least as far as the Manhattan Project during World War II, when computations were wrapped within computations. It continued from there with what we know as time-sharing, which rather crudely partitioned people's access to mainframes. In the early days of computing, the cost of a computer was so high that universities typically could afford only one, so students and professors had to sign up for whatever time slots were available, 24 hours a day.
The display terminal and parallel processing changed all of that in the 1960s and 1970s, followed by the PC in the 1980s. As basic programs were parallelized and/or virtualized, they could be partitioned across groups of users, rather than individuals. The cloud has pushed this to its technological apex, whereby workloads can be scaled up or down virtually. (It’s not clear if quantum computing will roll back the clock to the original time-sharing approach that accompanied early mainframes.)
Still, in the years between early computing and today's pervasive computing model, computers became cheap enough and common enough that partitioning by user was no longer relevant. But those computers, whatever form they take, still need to prioritize compute jobs. A smartphone may let you check e-mail and even do some multi-tasking, but when the phone rings it will block everything else. And while an early mainframe could do one thing relatively well, today's devices can do many things well enough, and often simultaneously. If an application can be partitioned across 10 or 20 cores, it will run faster than the same application running single- or dual-threaded on that system.
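To make that concrete, here is a minimal sketch of partitioning one compute-bound job across multiple cores. It uses Python's standard concurrent.futures; the kernel function, data size, and worker counts are illustrative assumptions rather than anything from the article.

```python
# Minimal sketch: splitting one compute-bound job across worker processes.
# The workload and worker counts are invented for illustration.
from concurrent.futures import ProcessPoolExecutor
import math

def heavy_kernel(chunk):
    # Stand-in for a compute-bound task on one partition of the data.
    return sum(math.sqrt(x) for x in chunk)

def run_partitioned(data, workers):
    # Split the data into one partition per worker, then fan out across cores.
    size = max(1, len(data) // workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(heavy_kernel, chunks))

if __name__ == "__main__":
    data = list(range(1_000_000))
    print(run_partitioned(data, 1))   # roughly the single-threaded case
    print(run_partitioned(data, 16))  # the same job spread across 16 cores
```

The two runs compare the single-threaded case against the partitioned one; the actual speedup depends on how cleanly the work divides and on the overhead of moving data between processes.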
This is particularly important for AI/ML applications, where the more processing elements that can be brought to bear, the higher the possible precision (assuming, of course, the training algorithm was correctly weighted and constructed) and the faster the results. So rather than a 60% probability that an answer is correct, that figure can be pushed to 95% in the same or less time.
As devices become more heterogeneous, partitioning is undergoing another shift. For years it has been well understood that tightly integrating hardware and software can have a giant payoff in both higher performance and lower power. The problem was that software and hardware engineering teams were generally isolated, in part because that sped up time to market, and in part because they basically spoke different languages. But as these two worlds begin to merge for AI/ML/DL, it is now possible to partition computing across much more specialized processors and accelerators, different segments of memory, and even different parts of a software application.
Add to that technologies such as virtualization/containerization and multiple chips connected over high-speed interfaces, and suddenly systems start looking like giant partitioning projects. In this case, however, partitioning can be used to prioritize dynamically. Rather than one function always taking control, priority may depend on a variety of factors. A tire sensor in a car may take precedence if the tire blows out, or it may be interrupted if the car is about to hit an object.
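One way to picture that kind of dynamic prioritization is a priority queue in which the most urgent event is always serviced first, whatever arrived earlier. This sketch is hypothetical; the event names and priority values are made up to mirror the tire-sensor example.

```python
# Hypothetical sketch of dynamic prioritization: more urgent events
# (lower numbers) are serviced before routine work, regardless of arrival order.
import heapq
import itertools

_order = itertools.count()  # tie-breaker so equal priorities stay first-in, first-out

def push(queue, priority, event):
    heapq.heappush(queue, (priority, next(_order), event))

def service(queue):
    while queue:
        priority, _, event = heapq.heappop(queue)
        print(f"servicing {event} (priority {priority})")

queue = []
push(queue, 5, "infotainment update")
push(queue, 2, "tire-pressure alert")        # takes precedence over routine work
push(queue, 1, "collision-avoidance brake")  # outranks even the tire alert
service(queue)
```

In a real vehicle the priorities themselves would shift with conditions, which is exactly the point: the partitioning of compute between functions is decided at run time rather than fixed in advance.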
In effect, the command center is able to gather up resources as needed, just the way that is done today in a hyperscale data center, and make partitioning decisions on the fly. Alternatively, the compute resources can be exercised evenly through load-balancing, or adjusted depending upon which ones are showing signs of aging or performance degradation.
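A load-balancing policy along those lines might look like the following sketch, where each compute resource carries a health weight and units showing degradation receive proportionally less work. The resource names and scores are assumptions for illustration only.

```python
# Assumed example: weighted load balancing that steers work away from
# resources whose health score indicates aging or degradation.
import random

resources = {"core0": 1.0, "core1": 1.0, "core2": 0.6}  # 0.6 = degraded unit

def assign(jobs):
    names = list(resources)
    weights = [resources[n] for n in names]
    placement = {n: [] for n in names}
    for job in jobs:
        # Healthier units are proportionally more likely to receive each job.
        target = random.choices(names, weights=weights, k=1)[0]
        placement[target].append(job)
    return placement

print(assign(list(range(20))))
```

Evening out the weights gives the pure load-balancing case; lowering a weight shifts work away from a unit that is wearing out.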
The big change is just how granular this can become, and how that granularity can be used to improve performance as needed in real time. This is still time-sharing at its roots, but it has taken on a whole new dimension.