Changes (For The Better) In Multicore Technology

New SDKs are power-aware and more sophisticated in dealing with power states; parallelization of software still lags.


By Pallab Chatterjee
Subtle but important changes are occurring in the multicore world, particularly in the power per function that is available from each core.

Two trends are increasingly evident. First, there is a growing need for higher-bandwidth data transfer or multiple-function processing for existing data transfer levels. And second, there is a shift to low power or mobile implementations for existing systems. But it’s important to remember that even though multicore designs may be able to address the higher bandwidths for data transferred between computers, they do so with a power penalty.

Some of the lower-power solutions are relatively easy to implement, such as selecting DDR3 memory over DDR2. The lower-power format, the availability of the IP cores in most FPGA products, and standard drivers make the transition quite straightforward. The DDR3 transition has taken a while because many chipmakers were waiting for costs to come down. Another change in memory involves the interface moving to SDXC memory from multiple SDHC external memory controls. Moving to a single higher-capacity device and storage location removes the duplicate channels and their associated power use, which can have a major impact (double-digit percentages) for some mobile systems.

To support the higher data bandwidths and also maintain signal integrity, new Software Development Kits (SDKs) are being introduced for most of the core processor products. A new feature, pretty much across the board, is the inclusion of power awareness in the SDKs.

Historically, these kits have focused only on the functionality of the application. Applications in a multicore environment now need details such as whether the core that the next data set is going to is “on” or “off” due to the power-saving controls, so the SDKs need to be aware of the power handling. Some of the SDKs are so sophisticated that they integrate the block power state control as part of the application development. Others have a multi-part solution that requires a power status control code block, with the application code creating a state file to be used in the development of the power control. In FPGA applications, the power control in the SDKs also could control the power state of the display and standard I/O.
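As a rough illustration of what power awareness means for application code, the sketch below routes a data set to a core that is already powered on and wakes a sleeping core only when none is available. It is not taken from any vendor SDK; the names `PowerState`, `Core` and `dispatch` are hypothetical, and the policy (least-loaded powered core first) is an assumption made for the example.

```python
from enum import Enum


class PowerState(Enum):
    ON = "on"
    OFF = "off"


class Core:
    """Hypothetical model of one core and its power state."""

    def __init__(self, core_id, state=PowerState.OFF):
        self.core_id = core_id
        self.state = state
        self.queue = []  # data sets waiting to be processed on this core

    def wake(self):
        # A real SDK would issue a power-control request here.
        self.state = PowerState.ON


def dispatch(cores, data_set):
    """Send data_set to a powered core if one exists; otherwise wake one.

    Returns the id of the core that received the data set.
    """
    if not cores:
        raise ValueError("no cores available")
    powered = [c for c in cores if c.state is PowerState.ON]
    if powered:
        # Assumed policy: pick the powered core with the shortest queue.
        target = min(powered, key=lambda c: len(c.queue))
    else:
        # Nothing is on, so pay the wake-up cost for the first core.
        target = cores[0]
        target.wake()
    target.queue.append(data_set)
    return target.core_id
```

With cores 0 (off) and 1 (on), `dispatch` sends the data to core 1 and leaves core 0 asleep, which is the kind of decision a functionality-only kit would not be able to make.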

A feature still lacking in the SDKs is automatic partitioning across the cores and parallelization of the application code. In applications that use two to eight internal cores, say for a network control application, the coding is still handled in two pieces: (A) the application code per core and (B) the data flow partitioning code, which directs data to the cores when ready. This is fine for applications that are inherently parallel, but for most algorithms, designers of multicore applications still end up with separate single applications, one running on each core, rather than utilizing the parallel bandwidth of the architecture to split a task between multiple cores.
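The two-piece structure described above can be sketched as follows. This is a minimal illustration rather than any SDK's actual model: Python threads stand in for cores, the per-core work (doubling a number) is a placeholder, and round-robin routing is an assumed partitioning rule. The names `core_application` and `partition_and_run` are hypothetical.

```python
import queue
import threading


def core_application(core_id, inbox, results):
    """(A) Application code that runs on one core."""
    while True:
        item = inbox.get()
        if item is None:  # sentinel: no more data for this core
            break
        results.put((core_id, item * 2))  # placeholder per-item work


def partition_and_run(data, num_cores=4):
    """(B) Data flow partitioning code that feeds the cores."""
    inboxes = [queue.Queue() for _ in range(num_cores)]
    results = queue.Queue()
    workers = [
        threading.Thread(target=core_application, args=(i, inboxes[i], results))
        for i in range(num_cores)
    ]
    for worker in workers:
        worker.start()

    # Assumed rule: hand items to the cores in round-robin order.
    for n, item in enumerate(data):
        inboxes[n % num_cores].put(item)
    for inbox in inboxes:
        inbox.put(None)
    for worker in workers:
        worker.join()

    collected = []
    while not results.empty():
        collected.append(results.get())
    return collected
```

Note that each worker runs the same independent single-core routine on its own slice of the data. Nothing here splits one task across cores, which is exactly the limitation the paragraph describes.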

The next major challenge seems to be addressed, at least in its early stages, by these new SDKs: In multicore architectures, memory thrashing is an issue, with each core on a common bus paging or flushing the data in memory each cycle. A number of the multicore SDKs include memory management to make the load, selection and clearing of the memory less I/O intensive and more conducive to high functional throughput, rather than the congestion and contention issues that dominate current-generation designs. These new tools are a major step toward bringing about mainstream multicore designs.
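One way to picture the reduction in bus traffic is a per-core buffer that collects writes locally and pushes them to shared memory in blocks instead of on every cycle. The sketch below is an assumption about the general idea, not a description of how any particular SDK manages memory; `SharedMemory`, `CoreBuffer` and the `bus_transactions` counter are hypothetical names used only to make the effect visible.

```python
class SharedMemory:
    """Hypothetical shared memory that counts accesses over the common bus."""

    def __init__(self):
        self.data = []
        self.bus_transactions = 0

    def write_block(self, block):
        # One bus transaction moves a whole block of values.
        self.data.extend(block)
        self.bus_transactions += 1


class CoreBuffer:
    """Per-core buffer that flushes to shared memory only when full."""

    def __init__(self, shared, capacity=4):
        self.shared = shared
        self.capacity = capacity
        self.pending = []

    def write(self, value):
        self.pending.append(value)
        if len(self.pending) >= self.capacity:
            self.flush()

    def flush(self):
        # Skip the bus entirely when there is nothing to write.
        if self.pending:
            self.shared.write_block(self.pending)
            self.pending = []
```

In this model, ten writes through a buffer of capacity four, followed by a final `flush()`, cost three bus transactions rather than ten.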