SM2: A Deep Neural Network Accelerator In 28nm

How Harvard researchers created a programmable chip for IoT applications based on an embedded FPGA.


Deep learning algorithms present an exciting opportunity for efficient VLSI implementations due to several useful properties: (1) an embarrassingly parallel dataflow graph, (2) significant sparsity in model parameters and intermediate results, and (3) resilience to noisy computation and storage. Exploiting these characteristics can offer significantly improved performance and energy efficiency. We have taped out two SoCs, one in 28nm bulk and one in 16nm FinFET. These chips contain CPUs, peripherals, on-chip memory, and custom accelerators to allow us to tune and characterize the efficiency and resilience of deep learning algorithms in custom silicon.
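
To make the sparsity property concrete, here is a minimal software sketch of zero-skipping in a fully connected layer. It is an illustration only and does not describe how the SM2 hardware is built: the function names, the layer sizes, and the use of ReLU to produce zero activations are all assumptions made for this example. The idea is that any input activation equal to zero contributes nothing to the output, so every multiply-accumulate (MAC) that would use it can be skipped.

```python
import random


def relu(values):
    """Clamp negative values to zero, which is what creates activation sparsity."""
    return [v if v > 0.0 else 0.0 for v in values]


def dense_layer_skip_zeros(weights, activations):
    """Compute weights @ activations, skipping columns whose activation is zero.

    weights: list of rows, one row per output neuron (hypothetical layout).
    Returns (outputs, macs_performed).
    """
    outputs = [0.0] * len(weights)
    macs = 0
    for j, a in enumerate(activations):
        if a == 0.0:
            continue  # a zero input cannot change any output, so skip it
        for i, row in enumerate(weights):
            outputs[i] += row[j] * a
            macs += 1
    return outputs, macs


def dense_layer_reference(weights, activations):
    """Dense baseline that performs every MAC, used only for comparison."""
    return [sum(w * a for w, a in zip(row, activations)) for row in weights]


# Assumed toy sizes; a fixed seed keeps the example reproducible.
rng = random.Random(0)
n_in, n_out = 64, 16
weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
activations = relu([rng.uniform(-1.0, 1.0) for _ in range(n_in)])

outputs, macs = dense_layer_skip_zeros(weights, activations)
dense_macs = n_in * n_out
print(f"MACs performed: {macs} of {dense_macs}")
```

The skipped work scales with the fraction of zero activations, and the result is identical to the dense computation. Every output neuron is also independent of the others, which is the parallelism named in property (1).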

