Getting A Complete Picture Of Automotive Software
How tools can help address complex automotive software architectures efficiently.
The automotive industry is currently undergoing a major disruption, usually referred to as the shift to automated, connected, electric, and shared vehicles (ACES[1]). Naturally, these changes also have a significant impact on the requirements of the hardware and software architectures of these new vehicles:
- Service-oriented software architectures used by multiple applications running on generalized computing platforms instead of hundreds of distributed ECUs
- Introduction of new middlewares such as the second version of the Robot Operating System (ROS 2[2]) or Adaptive AUTOSAR (AA[3])
- The shift from C to C++ in software development (e.g. both ROS 2 and AA are based on C++14)
- Data-driven algorithms require the shift from static to dynamic scheduling
- More extensive use of heterogeneous computing platforms (e.g. based on field programmable gate arrays (FPGAs), graphics processing units (GPUs), neural network (NN) accelerators, etc.) and corresponding programming languages (HLS, CUDA, OpenCL, SYCL, Halide, etc.)
While the requirements and associated challenges have become clearer, tooling and methodologies to address them efficiently and in a scalable manner are still lacking.
Static and dynamic source code analysis
Currently, code analysis tools can roughly be classified into two categories:
- Static Analysis: For static analysis, typically the source or object code of an application is analyzed directly (without executing it). Depending on the tools that are used, the resulting analysis can be fairly advanced (especially when formal methods can be used to prove certain properties of the code), but typically only a subset of the code can be analyzed completely statically. For code sections that cannot be analyzed statically, a conservative approach to classifying the severity of potential hazards can lead to many false positives. Static analysis tools can, for example, be used to ensure that code adheres to certain standards, e.g. those defined by the Motor Industry Software Reliability Association (MISRA[4]).
- Dynamic Analysis: For dynamic analysis, an application must be executed in a way that represents the scenario of interest (e.g. by providing corresponding inputs). As a result, the acquired results are not necessarily representative of the application as a whole but can be highly correlated with the selected scenario. Deriving (formal) guarantees based on dynamic analysis is typically difficult. Nevertheless, dynamic analysis can provide a good understanding of the execution behavior of an application and is frequently used e.g. in performance optimization.
While static and dynamic analysis tools can be helpful in analyzing and debugging an application, they provide only a limited overview of the software architecture and limited visibility into data flow and dependencies. To overcome those limitations, Silexica combines static and dynamic analysis based on a decade of compiler research. In addition, semantic analysis is used to derive a complete picture of the software architecture that encompasses both the logical software architecture and the dynamic execution behavior. This analysis can be used to prevent architecture erosion and provide actionable insights to optimize the software architecture and implementation.
Automotive use cases
This underlying core technology can be used for several use cases with immediate benefits for software architects and developers:
- Provide deep application insights into multi-binary and multi-threaded applications to prevent software architecture erosion: The Silexica programming tools (SLX) visualize thread genealogy and concurrency, communication, synchronization, and data dependencies, providing a live architectural overview from the source code which can be checked for consistency with the envisioned architecture to prevent software architecture erosion. It can also be used extensively for collaboration, onboarding, and documentation.
- Identify data races across applications and understand communication and memory bottlenecks. Shared memory analysis (currently for POSIX shared memory) makes you aware of how the application communicates among threads and with other applications. SLX shows all accesses to variables, including sub-objects of arrays and structs, even when accessed through pointers. This architectural overview based on the actual source code can be used in functional debugging and code refactoring. Additionally, protection analysis identifies missing inter-thread and inter-process shared memory protection (semaphores, mutexes) that may lead to data corruption. This enables the detection not only of data races between threads, but also between separate processes and applications.
- Optimize software distribution for the hardware compute blocks on a heterogeneous multicore system by performing fast “what-if” analysis to visualize code execution. Optimizations are driven by performance, power and memory requirements given a combination of CPUs, DSPs, FPGAs described in a standardized hardware description format (SHIM2[5]) and new innovative simulation capabilities. Both dynamic distribution as well as static mapping optimizations are supported.
- Actionable insights for improved software architecture and performance. Save development time by automatically identifying optimization opportunities in the code. SLX provides guidance to assist code refactoring for improved performance. Examples include data structure, synchronization, and loop optimizations. Additionally, the technology provides automated parallelism detection as well as pinpoints blocking conditions for potential parallelism. Different levels of parallelism are supported including task, pipeline, and data level parallelism.
SLX already supports the above use cases for POSIX environments. We are now working on integration with future middlewares as well, such as Adaptive AUTOSAR or ROS 2, a rewrite of ROS to enable production environments in real-time systems and embedded platforms. There are also forks of ROS 2 available, such as Apex.OS[6], a ROS 2 API-compatible distribution aimed at safety-critical applications. It is currently being certified according to the automotive functional safety standard ISO 26262 as a Safety Element out of Context (SEooC) up to ASIL D.
ROS integration
For ROS (both versions 1 and 2), an extensive open-source ecosystem of tools such as Rviz, rqt_graph, Gazebo, etc. already exists to help in developing autonomous systems, but these tools do not address the use cases described above. The SLX integration for ROS (currently under development) will allow SLX to fit smoothly into regular ROS development flows (e.g. based on the aforementioned tools). For this, we are currently extending SLX:
- Full ROS 1/ROS 2 build system integration:
  - On-demand analysis of individual modules
  - Support for parallel analysis of multiple modules
  - Full CI integration support to allow automated regression testing, etc.
- Automated root-cause analysis for application variability with full backwards traceability to the original source code line. This helps to understand why different runs of the same software behave differently. Reasons include different execution paths due to different input data or state, varying execution lengths due to e.g. recursive algorithms, and variations in execution behavior due to different dynamic scheduling permutations.
The video below gives a first look at what an integration of ROS into SLX could look like:
- Run your ROS scenario, here:
- Autoware[7] v1.8 (patched to increase stability)
- Based on “Moriyama” dataset
- Isolated GPS test scenario (runtime: ~10 seconds)
- Rate of GPS sensor (given by ROSbag): 25 Hz
- Environment: Custom Nvidia Docker container (which includes the full Autoware stack and SLX)
- Get an overview of the entire system with the SLX System Overview chart that fuses operating system profiling information (number of context switches, kernel/process runtimes, idle times, etc.) with ROS-specific information (e.g. active modules/topics, dependencies, etc.).
- Select a module for in-depth SLX analysis (here the Autoware “GPS” module: nmea2tfpose)
- Survey thread, function, and variable dependencies to get an overview of the software architecture
- Sorting the results reveals a significant number of synchronization calls (especially given the required module frequency (25 Hz) and scenario runtime (10 s))
- Directly jump to the source code to find the root cause of the synchronization calls and replace it with a code segment that reduces them
- Re-running the scenario shows a significant reduction in synchronization calls
https://youtu.be/HjIylwsWd50
While the scenario is quite simple and the proposed code change trivial[8], the example illustrates the game-changing potential of a ROS SLX integration for future software development. This holds especially true for larger development teams with multiple in-house ROS modules and a software architecture too complex to grasp using tools currently available in the market.
References
[1] Center for Automotive Research (https://www.cargroup.org)
[2] Robot Operating System 2 (https://index.ros.org/doc/ros2)
[3] AUTOSAR Adaptive (https://www.autosar.org/standards/adaptive-platform)
[4] Motor Industry Software Reliability Association (https://misra.org.uk)
[5] Multicore Association SHIM2 (https://multicore-association.org/workgroup/shim.php)
[6] Apex.OS by Apex.AI (https://www.apex.ai/products)
[7] Autoware (https://github.com/CPFL/Autoware, v1.8)
[8] It could even be argued that the proposed change is not portable and that the usage of ros::Rate would be more fitting, although the mechanism that keeps the update rate stable also incurs certain costs
Maximilian Odendahl
Maximilian Odendahl is the CEO and co-founder of Silexica. He was formerly the chief engineer of the Chair for Software for Systems on Silicon, leading 15 research assistants. Odendahl received a Computer Engineering diploma from RWTH Aachen University in 2010. He has more than 20 publications in international computing conferences and journals.