Embedded Software: Sometimes Easier, Often More Complex

Dependencies and partitioning can turn a simple piece of code into a complex system challenge.


Embedded software, once a challenge to write, update, and optimize, is following the route of other types of software. It is abstracted, simpler to use, and much faster to write. But in some cases, it’s also much harder to get right.

At a conceptual level, the definition of embedded software has not changed much. It still means low-level drivers and RTOSes that run close to the hardware, are deterministic in nature, and are time- and resource-critical. But the ecosystem around that software and the methodologies used to create it have changed significantly.

“End application developers have so many tools and API supports available from the silicon vendors and third parties,” said Kamal Karthikeyan, senior manager for product marketing at Renesas Electronics. “This helps the developers to integrate their application pretty fast and they don’t have to worry about any low-level hardware details. They can just use the tool interfaces to configure the hardware they intend to use, and employ software APIs to write their applications. This helps reduce development efforts considerably and achieve faster time to market. This means the face of embedded software is now the code configurators and APIs.”

Fig. 1: Making tradeoffs at a higher abstraction level. Source: Renesas

Abstraction now hides so much hardware detail that developers often don’t need any real hardware during development. Most of the work can be done with virtual hardware platforms, and the resulting embedded applications can be much lighter than in the past.

“The low-level architecture of the embedded software is still following the same line of flow in initializing the hardware-specific modules and bringing up the application entry,” said Karthikeyan. “This code will be developed and provided by the silicon vendors, so the end user need not be worried about this. Additionally, most of the low-level embedded code is still using the Assembly or C programming languages.”

Fig. 2: Embedded software enables the various components in a system in very specific ways, but the more complex the system the more dependencies need to be accounted for.

So while embedded software is changing at the end-user application level, the underlying embedded software developed by the silicon companies has not changed much.

“With the growing complexity of software and design for reuse across platforms, standard additional abstraction layers are useful for developers,” said Simon Davidmann, CEO of Imperas Software. “Examples include CMSIS (Common Microcontroller Software Interface Standard) from Arm, and Freedom Metal from SiFive. Abstractions are great and a useful approach, but challenges can be in the analysis and having a correct view of the context for the tasks, such as OS drivers for the first OS boot on a new platform, firmware, or application development. Specific verification, analysis, and profiling tools extend the traditional debug solution to cover the important context of abstractions. Any processor that boots an OS or RTOS can have a large number of processes and software threads, so the developers need to select the context to see from low-level events, hardware interactions, and instruction streams right up through the OS function calls to the application layer. For embedded developers the details matter, so you need the right tools that can cover abstractions well.”

Virtual platforms based on instruction-accurate models of the processor ISA allow introspection across the interactions of the processor to the peripherals, or in most multi-core, multi-threaded, or heterogeneous processors, the interactions between cores, especially around asynchronous events.

“With certain software simulation and analysis tools, the analysis tools are an abstraction below the virtual platforms, which does not require any modification to the OS or software,” Davidmann explained. “This works very well in high-reliability or security applications, as the production binary is used without modification and does not know it’s not running on physical hardware. Abstractions are a useful approach, but developers need tools that allow analysis within context. Such tools allow users to work and debug at the different abstraction levels, including the hardware abstraction layers through to full OS-aware tools. The key to efficient software development is to work at the appropriate level of abstraction – and software solutions using virtual prototypes enable that.”

This has significant implications for the engineering teams trying to decide on the architecture for their chip and system.

Johannes Stahl, senior director of product marketing at Synopsys, noted that in one example, the company that showed it could execute a particular edge application in the best possible way won the deal. “How do they do that? They get the application from the end customer. They run all the applications prior to tape-out on emulation to figure out that their architecture is actually supporting the software in the best possible way, and consumes the least amount of power. You have to do this pre-silicon because the risk is too high if you don’t. Also, it must be done pre-silicon because if you’re a startup company and you want to win a socket, you cannot wait until you have silicon. You have to do this engagement prior to that — and then, of course, follow up and show there is working silicon, so now they can touch it.”

The other related piece to this is functionally validating the software that runs on a new architecture, Stahl said. “That’s important for a tape-out, but it’s no longer sufficient. All customers in the space today are adding two more aspects to that. One is measuring the performance, as in how many cycles are getting through the architecture using this software workload. The second is power. Measuring both performance and power along with functional correctness is critical, and only this triplet of functionality, performance, and power gives them the criteria to basically go to tape out and say, ‘I have a viable architecture that’s actually matching what my software application wants.’”
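The three-way check Stahl describes can be sketched as a simple gate over pre-silicon workload runs. This is only an illustration, not any vendor's flow: the `WorkloadRun` record, the architecture names, the budgets, and every number below are hypothetical placeholders. Picking the lowest-energy candidate among the viable ones is an assumption that follows the "least amount of power" criterion mentioned earlier.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class WorkloadRun:
    """One emulation or virtual-prototype run of the customer workload (hypothetical record)."""
    architecture: str   # made-up candidate name
    passed: bool        # functional correctness of the software on this architecture
    cycles: int         # cycles needed to complete the workload
    energy_mj: float    # energy consumed by the run, in millijoules


def is_viable(run: WorkloadRun, max_cycles: int, max_energy_mj: float) -> bool:
    """A run is viable only if all three criteria hold: function, performance, power."""
    return run.passed and run.cycles <= max_cycles and run.energy_mj <= max_energy_mj


def pick_architecture(runs: Sequence[WorkloadRun], max_cycles: int,
                      max_energy_mj: float) -> Optional[WorkloadRun]:
    """Return the viable run with the lowest energy, or None if nothing is viable."""
    viable = [r for r in runs if is_viable(r, max_cycles, max_energy_mj)]
    if not viable:
        return None
    return min(viable, key=lambda r: r.energy_mj)


# Placeholder data for illustration only; these are not measured results.
RUNS = [
    WorkloadRun("arch_a", passed=True, cycles=900_000, energy_mj=4.2),
    WorkloadRun("arch_b", passed=True, cycles=1_200_000, energy_mj=3.1),  # misses the cycle budget
    WorkloadRun("arch_c", passed=False, cycles=700_000, energy_mj=2.5),   # fails functionally
    WorkloadRun("arch_d", passed=True, cycles=950_000, energy_mj=3.8),
]

best = pick_architecture(RUNS, max_cycles=1_000_000, max_energy_mj=5.0)
```

With these placeholder numbers, the fastest and the most frugal candidates are both rejected, one for failing functionally and one for missing the cycle budget, which is the point of evaluating the triplet together rather than one metric at a time.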

Emulation and virtual prototyping are especially critical as chips become more diversified, with an increasing number of subsystems. Some of these are meant for more application-level software, while other parts of the chip use real-time software. But even though this is an embedded device, it may have the same processing power as a desktop device or a server device on part of the chip, in addition to a real-time component.

Dan Driscoll, senior engineer at Siemens Embedded Solutions, noted that as complexity rises, so does the need for embedded engineering skill sets. “They are still required, even though there’s a lot more processing going on that isn’t falling into the deeply embedded space. There are enough domains that still do require those skills on the engineering side.”

Abstraction helps

Utilizing those skills plus a higher level of abstraction, it’s possible to get much more complex software running on chips in the ‘embedded space’ than in the past.

“With new architectures becoming so much more complex, with so many more processing elements, deeper pipelines, three or four levels of cache, and multiple subsystems that have the ability to interact, it requires a much more in-depth understanding of the overall system,” said Driscoll. “It’s a matter of where to put what, and how to utilize the overall hardware to get the most efficiency. It’s the same problem that used to exist in the deeply embedded space, but now it’s more complex than it used to be. Expertise is still very much needed to get the most out of these chips, and that’s what it comes down to. It’s how to optimize the use of these complex chips most efficiently, especially if there are hard, real-time requirements, or deadlines that require a real-time operating system or the like.”

In many systems today, all of this needs to work together. Functionality needs to be optimized, but it also needs to be part of an overall architectural partitioning.

“With zonal architectures in vehicles, for example, you can do the partitioning on microcontrollers or SoCs to say which scenario is being able to use this SoC and have no hard real-time or even safety-certified code running in a part of an SoC,” said Scot Morrison, general manager at Siemens Embedded Solutions. “Then you may have a general-purpose operating system — even Linux with much less stringent or maybe even relatively loose requirements on it — for a best-effort type execution running on the same chip, with items that just absolutely have to run. This makes it more of a system problem when you start talking about designing features that could run in a variety of different places in a vehicle, for example, and you don’t even know where they’re going to be running in the vehicle at the time that you’re doing the high-level design. The issue then becomes a design problem of system architecture.”

So while real-time programming itself is getting easier, it can add constraints at the application level. “Typically, the embedded system will work real time, and will have to do things real time, meaning that the latency is very important,” said Pierre-Xavier Thomas, group director of technical and strategic marketing for Tensilica IP at Cadence. “There needs to be a defined response time for the device, as opposed to something like a PC that is capable of doing everything or anything or any application. But you don’t necessarily know if it’s always going to do it in the same amount of time. In an embedded system, the time spent, and what the embedded system is doing with respect to the real-time system aspect, is very important. A lot of the challenges are about how to create the architecture so that you are able to do all the functionality you need in real time, and with the right architecture for the energy efficiency that you’re targeting.”
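One common way to make "a defined response time" concrete is classic fixed-priority response-time analysis, which iterates R = C + Σ ⌈R / T_j⌉ · C_j over all higher-priority tasks j until it reaches a fixed point. The sketch below is not taken from the interview. It assumes independent periodic tasks on a single core with preemptive fixed-priority scheduling, and the task names and timing values are hypothetical.

```python
from typing import NamedTuple, Optional, Sequence


class Task(NamedTuple):
    name: str
    wcet: int      # worst-case execution time (any consistent time unit)
    period: int    # minimum time between releases
    deadline: int  # relative deadline, assumed <= period


def response_time(task: Task, higher_priority: Sequence[Task]) -> Optional[int]:
    """Worst-case response time of `task`, or None if it would miss its deadline.

    Iterates R = C + sum(ceil(R / T_j) * C_j) over higher-priority tasks j
    until R stops changing or exceeds the deadline.
    """
    r = task.wcet
    while True:
        # -(-a // b) is integer ceiling division, which avoids float rounding.
        interference = sum(-(-r // hp.period) * hp.wcet for hp in higher_priority)
        nxt = task.wcet + interference
        if nxt > task.deadline:
            return None
        if nxt == r:
            return r
        r = nxt


# Hypothetical task set in milliseconds, listed from highest to lowest priority.
TASKS = [
    Task("sensor_read", wcet=1, period=4, deadline=4),
    Task("control_loop", wcet=2, period=6, deadline=6),
    Task("telemetry", wcet=3, period=12, deadline=12),
]

RESULTS = {t.name: response_time(t, TASKS[:i]) for i, t in enumerate(TASKS)}
```

For this made-up task set the lowest-priority task needs only 3 ms of its own execution but can take up to 10 ms to finish in the worst case, because higher-priority work keeps preempting it. That gap is why latency has to be analyzed at the system level rather than task by task.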

That can help determine what gets done in hardware and what gets done in software. “Hardware typically has the tendency to be more energy-efficient for a dedicated task, but less flexible,” Thomas said. “It also potentially requires a bigger area. If you have functionality that you want to add, and then you do it in hardware, because the hardware is less flexible, you’re going to need more blocks on your silicon to do that. That comes with higher leakage power, and also a higher price for your die. These are just some of the tradeoffs engineering teams are facing in order to realize their system.”

Making tradeoffs

Still, at some point in development the engineering team inevitably faces a tradeoff: whether to rewrite something that needs to be customized and optimized. This is particularly true when it involves optimization for performance and power.

“That’s where you can move the needle around,” said Rupert Baines, chief marketing officer at Codasip. “If you use a standard processor off the shelf that will do 80% of the job for 80% of the people 80% of the time, it might well be sufficient. When you start getting into projects with bigger budgets, more ambition, more complicated tasks, 80% stops being good enough. At that point you decide you must do something specific and not use a standard library. ‘We’re going to write some code ourselves to do something specific.’”

As with most designs, applications and use cases matter. “The next step would be, ‘We’re not going to use a standard processor,” Baines said. “We’re going to modify the hardware, and we’re going to do hardware-software co-integration.’ That’s in the high-end world, and that’s what Apple has done for many years with smartphones. It’s what they’ve done for the last couple of years with the M1 series, and they’ve designed their processor, the actual code, in order to suit the application that they’re running. The GPU, the neural network, the OS accelerators are all tightly coupled between software coming down, and the hardware coming up. We see this quite a lot in the embedded space. There are a lot of applications around co-processors, accelerators, domain-specific hardware, heterogeneous compute, where there is a very well-understood approach. It’s been difficult to implement because you haven’t necessarily had the ability to make changes to the processor. But things like RISC-V make that more and more possible.”
