Blurring The Lines At The OS Level

Choosing operating systems is getting more difficult as the definitions for general-purpose OSes, RTOSes and executable code become fuzzier.


By Ed Sperling
Picking an operating system—or choosing not to use an operating system—is becoming as complex a decision as choosing which IP to use in an SoC. Even decisions that sound straightforward may have ramifications on the total system power budget or performance, requiring them to be an integral part of the overall architectural process.

But the choice of operating systems, as well as the choices of how to use those OSes, is looking more like a spectrum of possibilities and tradeoffs than a simple menu. The options now include everything from hibernate modes in RTOSes and low-power modes in general-purpose OSes to combinations of general-purpose OSes, RTOSes and runtime code in processor clusters.

Factors to consider
One of the most important decisions in all of this is business-related. How will the design—and the software—be used in future iterations of a chip? Due to skyrocketing costs and shortened market windows, most complex SoCs need to be created with derivatives and engineering change orders in mind. That generally means a mix of general-purpose OSes, real-time operating systems (RTOSes) and executable code. While the choice of an OS is often market-related, the choice of an RTOS is hardware-related, and the executable code is both task- and hardware-specific.

That sounds straightforward enough, except that the lines are blurring between all of these. When RTOSes were originally created to handle very specific tasks for military hardware and industrial applications such as thermostats, the focus was on reliability and consistency. The key was how long it took to complete a scheduled task. More recently, the focus has shifted to how much power it takes to perform that task, what can be leveraged from one design to the next, and how much time it takes to get a complex design out the door.

“What an RTOS provides is determinism,” said Gene Matter, senior applications manager at Docea Power (and a former Intel architect). “That allows you to optimize the heck out of it for temperature and performance.”

Determinism is a key concept in the RTOS world. Being able to program software that will behave the same way every time with guaranteed results, particularly with a well-defined footprint, is required for mission-critical or SoC-critical functionality. An RTOS may only do one thing or it may do multiple things, but it has to deliver those results without fail in a fixed amount of time, in applications ranging from missile guidance systems to pacemakers to automobile engine functions.

At the other extreme are general-purpose operating systems, which provide a set of standard application programming interfaces (APIs) for applications developers. The general-purpose OS—which can include anything from Microsoft Windows to Google Android and Apple iOS, all the way up to IBM’s mainframe OSes and various implementations of Linux—makes all the necessary connections to hardware and the outside world that the application could ever need.

General-purpose OSes are extremely portable and flexible, but they aren’t particularly energy-efficient, which drives up the power expended on any specific operation. They also include so much code and require so many updates that errors and an ever-growing footprint are inevitable. Still, if portability and flexibility are important, the classic idea of an RTOS is too confining.

But increasingly, RTOSes are exhibiting some of the same portability as general-purpose OSes. They’re being bundled with tools and drivers to allow some degree of portability and flexibility. Likewise, general-purpose OSes are being shrunk to fit into smaller footprints with options for turning on and off specific functionality.

Making choices
So with power and performance now so tightly coupled on a densely packed piece of silicon, the question becomes what to use when and where. This is a big challenge for design teams, and there is no simple formula. Instead of discrete choices, there are tradeoffs that need to be analyzed for multiple scenarios—a mind-numbing number of choices, in fact, that will only increase at each new process node and in stacked-die configurations.

Dan Driscoll, Nucleus RTOS architect at Mentor Graphics, said a general-purpose OS offers enormous benefits because it is widely supported and open for development, especially in the multimedia space. RTOSes traditionally have been more limited, but Mentor’s approach has been something of a hybrid—a general-purpose RTOS that includes a thin abstraction layer to allow the kernel to work on multiple hardware platforms.

“Our strategy has been to keep as much of the software generic as possible for scalability reasons,” said Driscoll. “The old kernel was written in assembly code and it didn’t scale particularly well. We’ve been able to increase performance by moving it to C, while still keeping a focus on performance and footprint. This also used to be sold a la carte. Now there’s a complete package of tools, debugger and source code.”

That has made it particularly popular in a new market—medical electronics, according to Driscoll. “We’re also seeing interest in places like smart grid design, where there is a push to add more intelligence. But the big thing is still hardware support. Companies choose hardware before software, and they want tools—especially the bigger companies.”

Still, not all choices are RTOS vs. general-purpose OS. Steve Roddy, vice president of marketing at Tensilica, said the choice is sometimes no OS at all.

“In a baseband modem there are multiple cores in the PHY layer,” Roddy said. “The upper cores, or the bigger cores, may be running an RTOS. But there also may be specialty cores that use single-threaded runtime code. So you may have five cores running single-threaded code while one runs the OS and does the task switching.”

Roddy said that RTOSes such as Express Logic’s ThreadX and Mentor’s Nucleus are still comprehensible by a single software developer because they have 10,000 to 40,000 lines of code, compared with hundreds of thousands or millions of lines of code in general-purpose OSes.

“If you have a defined team all in-house and you know the entire universe of code that runs on a printer or a home gateway, that’s a tractable problem. You already have programmers who are well aware of the hardware realities of the chip and the box the RTOS fits in. The main attributes you’re looking for are the overall power envelope, the interrupt response time and the footprint,” he said.

What defines success
So what makes one RTOS better than another? The answer isn’t so clear anymore, in large part because of the necessity of re-using large portions of a chip in future chips and derivatives. That explains why there are so many choices—and why the lines are getting so fuzzy.

“At the base level you need fault tolerance and resilience,” said Docea’s Matter. “You need a guarantee of deterministic behavior. But a lot of determinism is in the developer’s hands. These days when you look at Wind River, a lot more of the facilities you used to see in Linux and Unix are in the RTOS. And then with Linux, there are new versions with a smaller footprint and layers that try to provide portability so drivers can be used on other hardware.”

That point has become particularly important for software teams that aren’t used to developing drivers.

“The key is bringing together models and levels of abstraction so you can figure out what the tradeoffs are,” said Matter. “You need to run the real-time application behavior on top of the underlying hardware.”

That means taking into account performance, power, footprint and the required level of reliability, as well as whether the software will run alone or alongside other code, and whether it will be used as a black-box subsystem or inside a chip that will see multiple derivatives and potentially different processors or cores. Defining success in those situations isn’t easy—and that definition may change over time.

For design teams, this is good news and bad news. The good news is there are now options available in software for just about every possible configuration and use model. The bad news is they have to sift through lots of different options and make some hard choices.