Disaggregating And Extending Operating Systems


The push toward disaggregation and customization in hardware is starting to be mirrored on the software side, where operating systems are becoming smaller and more targeted, supplemented with additional software that can be optimized for different functions. There are two main causes for this shift. The first is rising demand for highly optimized and increasingly heterogeneous designs, which... » read more

Embedded AI On L-Series Cores


Over the last few years there has been an important shift from cloud-level to device-level AI processing. The ability to run AI/ML tasks has become a must-have when selecting an SoC or MCU for IoT and IIoT applications. Embedded devices are typically resource-constrained, however, which makes running AI algorithms on them difficult. This paper looks at what could make it easier from a softwar... » read more
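
As a rough illustration of what device-level inference looks like in software, the sketch below runs a quantized model with the tflite-runtime interpreter. The model filename is a placeholder, and the paper does not prescribe any particular runtime; this is one common way to do it on a constrained Linux-class device.

    # Hypothetical on-device inference with a quantized model (assumes a
    # file "model_int8.tflite" and the tflite-runtime package installed).
    import numpy as np
    import tflite_runtime.interpreter as tflite

    interpreter = tflite.Interpreter(model_path="model_int8.tflite")
    interpreter.allocate_tensors()

    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]

    # Feed one sample shaped and typed to the model's expected input.
    sample = np.zeros(inp["shape"], dtype=inp["dtype"])
    interpreter.set_tensor(inp["index"], sample)
    interpreter.invoke()
    prediction = interpreter.get_tensor(out["index"])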

Why TinyML Is Such A Big Deal


While machine-learning (ML) development activity most visibly focuses on high-power solutions in the cloud or medium-powered solutions at the edge, there is another collection of activity aimed at implementing machine learning on severely resource-constrained systems. Known as TinyML, it’s both a concept and an organization — and it has acquired significant momentum over the last year or... » read more

Developers Turn To Analog For Neural Nets


Machine-learning (ML) solutions are proliferating across a wide variety of industries, but the overwhelming majority of the commercial implementations still rely on digital logic for their solution. With the exception of in-memory computing, analog solutions mostly have been restricted to universities and attempts at neuromorphic computing. However, that’s starting to change. “Everyon... » read more
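
For readers unfamiliar with the idea, the toy sketch below shows why analog arrays appeal to neural-net designers: a matrix-vector product falls out of Ohm's and Kirchhoff's laws when weights are stored as conductances and inputs are applied as voltages. The Gaussian noise term is an assumption standing in for device non-idealities, not a model of any specific hardware.

    # Toy model of an analog in-memory multiply-accumulate: column currents
    # sum the voltage-conductance products, giving a matrix-vector multiply
    # "for free" in the physics of the array.
    import numpy as np

    rng = np.random.default_rng(0)
    G = rng.uniform(0.0, 1.0, size=(64, 16))   # conductances (weights)
    v = rng.uniform(-1.0, 1.0, size=64)        # input voltages (activations)

    ideal = v @ G                               # what digital logic would compute
    noisy = ideal + rng.normal(0.0, 0.01, size=ideal.shape)  # analog read noise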

Compiling And Optimizing Neural Nets


Edge inference engines often run a slimmed-down real-time engine that interprets a neural-network model, invoking kernels as it goes. But higher performance can be achieved by pre-compiling the model and running it directly, with no interpretation — as long as the use case permits it. At compile time, optimizations are possible that wouldn’t be available if interpreting. By quantizing au... » read more
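
As a minimal sketch of the quantization step mentioned above, the following applies symmetric int8 post-training quantization to one weight tensor. The max-based scale is an illustrative calibration choice, not any particular compiler's scheme.

    # Symmetric int8 quantization of a weight tensor at "compile" time.
    import numpy as np

    w = np.random.randn(128, 128).astype(np.float32)
    scale = np.abs(w).max() / 127.0             # map observed range onto int8
    w_q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)

    # At inference the kernel operates on int8 and rescales the accumulator.
    w_hat = w_q.astype(np.float32) * scale      # dequantized view, for error checks
    print("max abs error:", np.abs(w - w_hat).max())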

The Evolution Of High-Level Synthesis


High-level synthesis is getting yet another chance to shine, this time from new markets and new technology nodes. But it's still unclear how fully this technology will be used. Despite gains, it remains unlikely to replace the incumbent RTL design methodology for most of the chip, as originally expected. Seen as the foundational technology for the next generation of EDA companies around the ... » read more

Week In Review: Design, Low Power


Tools & IP Arm unveiled several new processor IPs. Targeting next-gen smartphones, the Cortex-A78 CPU provides a 20% increase in sustained performance over Cortex-A77-based devices within a 1-watt power budget, and more efficient management of compute workloads and on-device ML. The Mali-G78 GPU provides a 25% increase in performance over the Malti-G77. It supports up to 24 cores and in... » read more

What Machine Learning Can Do In Fabs


Semiconductor Engineering sat down to discuss the issues and challenges with machine learning in semiconductor manufacturing with Kurt Ronse, director of the advanced lithography program at Imec; Yudong Hao, senior director of marketing at Onto Innovation; Romain Roux, data scientist at Mycronic; and Aki Fujimura, chief executive of D2S. What follows are excerpts of that conversation. » read more

Improving Algorithms With High-Level Synthesis


Most computer algorithms today are developed in high-level languages on general-purpose computers. But someday they may be deployed in embedded systems, where the development, verification, and validation of algorithms is done in languages like Python, Java, C++, or even numerical frameworks like MATLAB. This is the goal of high-level synthesis (HLS), which aims to solve a fundamental proble... » read more
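
One concrete step in that journey, sketched below, is replacing floating point with fixed point so an HLS tool can map the arithmetic onto narrow integer datapaths. The Q4.12 format and the helper names are hypothetical choices for the sketch, not anything prescribed by the article.

    # Fixed-point refactoring of a floating-point multiply (assumed Q4.12).
    FRAC_BITS = 12

    def to_fixed(x: float) -> int:
        return int(round(x * (1 << FRAC_BITS)))

    def fixed_mul(a: int, b: int) -> int:
        return (a * b) >> FRAC_BITS             # renormalize after the multiply

    a, b = to_fixed(1.5), to_fixed(-0.25)
    print(fixed_mul(a, b) / (1 << FRAC_BITS))   # -0.375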

Checkmate: Breaking The Memory Wall With Optimal Tensor Rematerialization


Source: Published on arXiv 10/7/2019
Paras Jain, Ajay Jain, Aniruddha Nrusimha, Amir Gholami, Pieter Abbeel, Kurt Keutzer, Ion Stoica, Joseph E. Gonzalez
A recent paper published on arXiv by a team of UC Berkeley researchers notes that neural networks are increasingly impeded by the limited capacity of on-device GPU memory. The UC Berkeley team uses off-the-shel... » read more
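
A simpler relative of this idea is PyTorch's built-in activation checkpointing, sketched below. Unlike Checkmate, which solves for an optimal rematerialization schedule, this fixed strategy merely illustrates the memory-for-compute trade-off the paper targets.

    # Drop intermediate activations in the forward pass and recompute them
    # during backward, trading extra compute for lower peak memory.
    import torch
    from torch.utils.checkpoint import checkpoint

    layer = torch.nn.Sequential(torch.nn.Linear(512, 512), torch.nn.ReLU())
    x = torch.randn(32, 512, requires_grad=True)

    y = checkpoint(layer, x, use_reentrant=False)  # activations not retained
    y.sum().backward()                             # recomputed here instead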
