Firmware Skills Shortage

Adding intelligence into devices requires a different skill set, and finding enough qualified people is becoming a challenge — especially in less glamorous areas.


Good hardware without good software is a waste of silicon, but with so many new processors and accelerator architectures being created, and so many new skills required, companies are finding it hard to hire enough engineers with low-level software expertise to satisfy the demand.

Writing compilers, mappers and optimization software does not have the same level of pizazz as developing new AI algorithms or a new smartphone app. But without them, the whole industry will suffer.

Universities are struggling to turn out enough graduates with the mix of skills the industry needs. Part of the problem is that the required range of skills is expanding to the point where it is unreasonable to expect anyone to cover it without several degrees. For example, new graduates may know how to create an AI algorithm, but few understand the implications of making it run efficiently on an edge device, or how to debug a system that is providing incorrect results.

There is a lack of software people to start with. “You will never have enough programmers,” says Robert Owen, principal consultant for the Worldwide University Programme at Imagination Technologies. “The amount of software you can make could always be more. There’s always extra functions you can add, there’s always more testing, more verification, better tools you can create. The demand for programming in the broadest sense is growing, and will continue to grow for the foreseeable future.”

But with new hardware, additional firmware people are necessary. “How do you make sure you can leverage the hardware in the best possible way? That is a huge issue, and not just a skills gap,” says Anoop Saha, senior manager for strategy and business development at Siemens EDA. “It is not just a knowledge gap. Fundamentally it is a hardware/software co-design problem that has to be solved. It used to be that hardware would be designed and used by the software using a fairly standard interface. Now it goes beyond that. Hardware and software have so many interfaces and so many layers in between, that breaking down organizational silos is extremely vital.”

Hardware without software is useless
Many hopeful startups have learned that no matter how good their hardware, if they do not invest enough in software, or fail to make the product easy enough to use, they will fail. “The early success that TI got in digital signal processing (DSP) was because of the quality of its compiler,” says Owen. “Looking back, the compiler wasn’t very good at all, but it was the best one. That gave TI a competitive edge and gradually the compilers improved.”

Back then, it required new skills. “I was working in the image compression domain 20 years ago,” says Benoit de Lescure, CTO for Arteris IP. “We had a similar issue, both with off-the-shelf DSP, and our own specialized DSP. Finding people to make efficient use of the SIMD hardware was extremely difficult, because you need to ‘think parallel’. But few people are ready to deep-dive into a hardware architecture that might be obsolete two or three years from now.”

Still, everyone knows they have to find those people. “It doesn’t matter how good the hardware substrate is, if it’s a nightmare to use, people won’t,” says Tim Atherton, director of research for AI at Imagination. “A great deal of the cost and time goes into the software. It is the software that sits above the hardware that is critical to having a product that will be a success.”

We have seen more recent examples of success. “Look at what Nvidia did with CUDA,” says Elias Fallon, software engineering group director for the Custom IC & PCB Group at Cadence. “So much of their success is around making a programming interface that lets people not worry too much about the hardware. Write your code this way, use these libraries, and you will get massive improvements in performance. All accelerators have to come up with something similar.”

The industry is facing the next challenge. “Semiconductor companies that have created innovative processor architectures now face that challenge,” says Ian Ferguson, vice president of sales and marketing for Lynx Software Technologies. “How can they help programmers harness these components for a diverse set of workloads and frameworks that are changing?”

Unlike in the past, it is not always the hardware architectures that are leading. “Two years ago, Bidirectional Encoder Representations from Transformers (BERT) and Enhanced Language Representation with Informative Entities (ERNIE) were largely just academic papers,” notes Ferguson. “But now, these models are used for a number of natural language processing applications. This will have implications on hardware processor architectures and toolchains for some time to come.”

AI is a game changer
While many college grads are armed with knowledge about AI, few know what it takes to map that onto hardware. “There is a distinct difference in skill sets between ML developers who create and train neural nets, and embedded programmers accustomed to optimizing algorithms or application code for embedded platforms,” says Steve Roddy, vice president of product marketing for the Machine Learning Group at Arm. “It is a fallacy to think that vast numbers of either group can be rapidly retrained to bridge the gap, as the skills in question take years to build and master.”

That gap needs to be filled with both people and tools. “Currently there is a gap between the high-level data scientists, who also work on high-level tools, and the specific hardware implementation,” says Andy Heinig, group leader for advanced system integration and department head for efficient electronics at Fraunhofer IIS’ Engineering of Adaptive Systems Division. “For this reason, new design methodologies are necessary that support the high-level tools of the data scientist and transform their knowledge into the highly optimized hardware. In addition, design methodologies are necessary that can be used to compare different high-level implementations on specific hardware under real world conditions.”

Those comparisons are not simple 1:1 mapping problems. “The successful AI chips will be those that both satisfy two sets of criteria,” says Geoff Tate, CEO of Flex Logix. “First, they have to deliver more inferences per second, per dollar, and per watt, at high quality for the customer’s neural network model. The customer has a dollar budget and a power budget and wants the most performance they can get within their budget. And second, they want to be able to re-use software. If the customer has to do detailed, low-level, hardware-specific programming, it will increase their cost of AI development and delay their schedules. It also will make it very hard to evaluate new architectures.”

Traditional CPUs used compilers to create an optimized mapping, using just a few switches to modify how the optimizations were performed. Even for CPUs, this is proving to be highly limiting because that approach does not take into account power optimization, code size optimization, or a host of things — other than performance — that are becoming more critical today.

“For CPUs, including high-performance aspects of their architectures, the details have largely been abstracted away,” says Imagination’s Atherton. “Designing new networks that are very large and sophisticated programs, and usually appear as graphs, and mapping those onto dozens of different types of architectures is going to be very difficult to do by hand, if not impossible.”

There are fundamental differences in these architectures. “When people talk about non-von Neumann architectures, where there’s a separation of memory and execution, I understand that as an electrical engineer,” says Cadence’s Fallon. “But when I start thinking about how to write code, or how to adapt code to be more data-centric, that creates more problems. There’s a big gap there, and a lot of accelerator companies will struggle with it.”

The industry is working on frameworks to make this possible. One example, shown in Fig. 1, is the Tensor Virtual Machine (TVM), a hierarchical multi-tier compiler stack and runtime system for deep learning in which most of the stack stays the same regardless of the back-end hardware.

Fig. 1: Open Compiler for AI Frameworks. Source: Apache Software Foundation
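To make the idea of a shared stack with swappable back ends concrete, here is a minimal sketch in Python. It is not TVM code and does not use TVM's API. The `Op` record, the dead-operation pass, and the two toy back ends ("cpu" and "accel") are all hypothetical names invented for illustration. The only point is that the graph-level optimization is written once, while each new target supplies just its own lowering function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """One node of a toy computation graph (hypothetical, not a TVM type)."""
    name: str       # operation, e.g. "matmul", "add", "relu"
    inputs: tuple   # names of the tensors it reads
    output: str     # name of the tensor it writes


def eliminate_dead_ops(graph, outputs):
    """Hardware-independent pass: drop ops whose results are never used.

    Assumes `graph` is listed in topological order (producers first).
    """
    live, kept = set(outputs), []
    for op in reversed(graph):          # visit consumers before producers
        if op.output in live:
            kept.append(op)
            live.update(op.inputs)
    return list(reversed(kept))


# Two made-up back ends. Only this part changes per hardware target.
def lower_cpu(op):
    return f"call cpu_{op.name}({', '.join(op.inputs)}) -> {op.output}"


def lower_accel(op):
    return f"ACC.{op.name.upper()} {op.output}, {', '.join(op.inputs)}"


BACKENDS = {"cpu": lower_cpu, "accel": lower_accel}


def compile_graph(graph, outputs, target):
    """Shared front end and optimizer, followed by a target-specific lowering."""
    optimized = eliminate_dead_ops(graph, outputs)
    return [BACKENDS[target](op) for op in optimized]


graph = [
    Op("matmul", ("x", "w"), "t0"),
    Op("add", ("t0", "b"), "t1"),
    Op("exp", ("t0",), "unused"),       # result is never consumed
    Op("relu", ("t1",), "y"),
]

for target in BACKENDS:
    print(target, compile_graph(graph, ("y",), target))
```

Supporting another architecture in this sketch means adding one entry to `BACKENDS`. A real back end also has to handle scheduling, memory layout and code generation, which is where most of the scarce expertise goes.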

The skill set
Given the number of unique hardware architectures being created, that means a lot of different back ends have to be written. Those people need a broad range of skills. “They need to be software literate, they need to be hardware literate, they need to be computer-hardware-literate, they have to understand the optimizations you might apply to a graph that describes a particular neural network,” says Atherton. “This is not a skill set that you find in many people. The way we cope is to have the combined skill set in several research engineers, some with Ph.D.s, who together make the complete picture. But trying to get all of that in one person is hard.”

It requires thinking about the problem differently than in the past. “We have to break down the organizational silos,” says Siemens’ Saha. “Today, you have the hardware team where everyone is an expert in hardware, and the same with the software team. But what we need is a combined team that consists of a certain number of hardware engineers and software engineers, so the combined team knowledge is useful.”

Could there be one person who understands it all? “The ideal background for these people would be that they had a degree in mathematics, probably also in computer science, and maybe they’ve done some computer engineering, as well, so they have an understanding of the hardware architectures,” says Owen. “Having done those three degrees, then they would be able to apply those skills and nicely connect the worlds of something like Tensorflow through an efficient compiler all the way to map onto an architecture.”

Such a person would come at a very high price. “The entire industry is facing challenges because learning programming is a basic skill,” says Saha. “It has to be a ‘101’ course. Then you can specialize into various domains. What we also need is knowledge about algorithms and data science. A knowledge of statistics, mathematical modeling, and data science is more fundamental than machine learning. This is across all domains, and equally applicable to the needs of EDA. EDA is primarily an optimization problem for the hardware industry. We have always managed to bridge that gap with people who have good programming skills, but they also understand hardware design and what it takes to create good hardware.”

Many universities are ramping up their AI/ML courses. “The supply is limited and there is huge demand for the output of the universities,” says Owen. “The universities are aware they should be producing more graduates of this type and they are trying to do it. It may take a few years to develop the necessary skills on top of the university degrees, but I am not negative about it, and they will be highly paid.”

Let us not forget that it is often the universities that create change. “Universities have been key in driving the change towards non-traditional compute architectures over the last 10 years,” says Alex Grbic, VP of software engineering for Untether AI. “In addition to being a hotbed of innovation in these areas, universities have also been ramping up the number of degree programs and graduates in machine learning / deep learning. In particular, universities in the Toronto area, Ontario and Canada have seen the need and are addressing it.”

However, even this does not address the needs of the semiconductor industry. “Machine learning is really cool, but how do I make it work on an edge device?” asks Fallon. “How do I make it work when I only have fixed-point math? How can I make it better when I want to be able to really squeeze the network down? How can I make it as small a network as possible, and rather than increasing the accuracy by 1%? How do I investigate the potential for decreasing the accuracy by 1% if it only consumes 1/10 of the area?”
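The fixed-point question can be illustrated with a small sketch. The Python below assumes symmetric, per-tensor 8-bit quantization, which is one common scheme among several and is not attributed to anyone quoted here. The weights and inputs are made-up numbers. It quantizes both vectors to integers, accumulates the dot product in integer arithmetic, and rescales once at the end, so the accuracy given up by dropping floating point can be measured directly.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: real value ~= scale * q, q in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dot_fixed(qa, scale_a, qb, scale_b):
    """Dot product with an integer accumulator and a single rescale at the end."""
    acc = sum(x * y for x, y in zip(qa, qb))    # pure integer arithmetic
    return acc * scale_a * scale_b


# Illustrative numbers only.
weights = [0.82, -0.41, 0.07, -1.27, 0.55]
inputs = [0.10, 0.90, -0.30, 0.25, -0.60]

qw, sw = quantize_int8(weights)
qx, sx = quantize_int8(inputs)

exact = sum(w * x for w, x in zip(weights, inputs))
approx = dot_fixed(qw, sw, qx, sx)

print("float result:      ", exact)
print("fixed-point result:", approx)
print("absolute error:    ", abs(exact - approx))
```

Each 8-bit value needs a quarter of the storage of a 32-bit float. Whether the resulting error is acceptable depends on the network and the application, which is exactly the trade-off Fallon describes.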

This is the practical side of AI deployment. “There is a gap looming,” says Lynx’s Ferguson. “Programmers are writing code in high-level languages, assuming infinite memory and CPU cycles provided by the cloud. I see a gap for people to create optimized applications for more custom applications, especially resource-constrained ones where power or processing is limited. While the TinyML effort has helped greatly, there is still a gap.”

New courses are being designed. “We are putting together a course called edge AI principles and practices,” says Owen. “It is aimed at undergraduates. As well as covering the basics, it will enable students to do some exercises — not just looking at images and segmenting them, but also things like speech applications. We almost take speech recognition, speech translation, natural speech creation, and more, for granted on the edge these days because they are becoming embedded in almost everything.”

Tools are a necessary part of it. “The answer lies in building toolsets that guide and automate the migration of AI workloads from their inception in the cloud with infinite numerical precision and compute resource, into inference deployment within constrained compute devices,” says Arm’s Roddy. “For embedded developers, mapping a pre-trained, quantized model to a target hardware, a series of optimization tools are required that are specialized to a particular target. They are optimizing data flows, compressing model weights, merging operators to save bandwidth, and more.”
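Merging operators is one of the optimizations Roddy mentions. As a rough sketch, the Python below folds a per-channel scale-and-shift step (the form a batch normalization layer takes at inference time) into the weights and bias of the linear layer before it. The function names and the numbers are hypothetical. The algebra is standard: gamma * (W x + b) + beta equals (gamma * W) x + (gamma * b + beta), so the two operators collapse into one and the intermediate result never has to be written to memory.

```python
def linear(W, b, x):
    """y = W x + b for a small dense layer stored as nested lists."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def scale_shift(gamma, beta, y):
    """Per-output scale and shift, as a separate second operator."""
    return [g * yi + bt for g, bt, yi in zip(gamma, beta, y)]


def fold_scale_shift(W, b, gamma, beta):
    """Merge the scale-and-shift into the linear layer's parameters."""
    W_folded = [[g * w for w in row] for g, row in zip(gamma, W)]
    b_folded = [g * bi + bt for g, bi, bt in zip(gamma, b, beta)]
    return W_folded, b_folded


# Illustrative numbers only.
W = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.75]]
b = [0.1, -0.2]
gamma = [2.0, 0.5]
beta = [-1.0, 3.0]
x = [1.0, 2.0, 3.0]

two_ops = scale_shift(gamma, beta, linear(W, b, x))   # two passes over the data
W_f, b_f = fold_scale_shift(W, b, gamma, beta)
one_op = linear(W_f, b_f, x)                          # one pass, same result

print(two_ops)
print(one_op)
```

The folding happens once, offline, so the deployed model runs a single operator where the trained model had two.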

Someone has to write the tools. “Power, performance and area have driven things in the past,” says Atherton. “AI has added a fourth — bandwidth. Neural networks really chew up bandwidth. You have to get data in and out, you have to modify the architecture to improve bandwidth, minimize the area, and power has always been important while pushing performance up.”
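A back-of-the-envelope calculation shows why bandwidth earns a place alongside power, performance and area. The sketch below counts the multiply-accumulate operations and the bytes moved for a single fully connected layer. It rests on simplifying assumptions that are ours, not Atherton's: every weight and activation crosses the memory interface exactly once, nothing is cached or reused, and the layer size is arbitrary.

```python
def fc_layer_traffic(in_features, out_features, batch, bytes_per_value):
    """Return (MACs, bytes moved, MACs per byte) for one fully connected layer.

    Assumes each weight, input and output value is transferred exactly once.
    """
    macs = batch * in_features * out_features
    weight_bytes = in_features * out_features * bytes_per_value
    activation_bytes = batch * (in_features + out_features) * bytes_per_value
    total_bytes = weight_bytes + activation_bytes
    return macs, total_bytes, macs / total_bytes


# A 1024 x 1024 layer processing one input at a time (illustrative size).
for label, width in (("32-bit float", 4), ("8-bit integer", 1)):
    macs, moved, intensity = fc_layer_traffic(1024, 1024, 1, width)
    print(f"{label}: {macs} MACs, {moved} bytes, {intensity:.3f} MACs per byte")
```

Under these assumptions the layer performs well under one multiply-accumulate per byte fetched in either format, so the arithmetic units spend most of their time waiting on memory unless the architecture or the tools find ways to reuse data.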

Invisible problems
Companies also face a few additional problems that are rarely talked about. “There is another factor that has impacted us,” says Saha. “When we release a product that has AI capabilities, what happens when it doesn’t work? How do you figure out what the issue is? You cannot just run the debugger and figure out that this piece of the code is not working. You need to figure out what was missing. Was it the algorithm? Was it the data? Was it the application? It could be a much wider range of things. So now, the support people have to be data scientists and be able to understand where the problem may be found.”

Hiring is a challenge for many companies. “Given that AI has such a broad applicability among programmers, many are pursuing careers at companies with household names like Apple, Google and Tesla,” says Ferguson. “While these organizations have a healthy flow of hires, it leaves other industries struggling to fill the openings in their own organizations.”

EDA has to compete for those same people. “The type of ECE grads who understand enough about chips, and how the hardware really works, and have some software and machine learning knowledge, are in high demand,” says Fallon. “These are the same people who are wanted for machine learning jobs all over the industry, and so we are going to see a lot of a crunch there.”

Being a firmware or low-level software engineer has never been glamorous. These people are rarely given credit for what they accomplish, and the industry has seen many examples where a failure of this software brought down a company. While the universities are ramping up to meet the demands of high-profile AI/ML companies, the development of courses for the more mundane problem of making AI/ML systems usable on real hardware still seems a long way off.

For those who have the necessary mix of skills and do not seek fame, fortune may be the reward.

