Packing Neural Networks Into End-User Client Devices


Most of today’s neural networks can only run on high-performance servers. There’s a big push to change this and simplify network processing to the point where the algorithms can run on end-user client devices. One approach is to eliminate complexity by replacing floating-point representation with fixed-point representation. We take a different approach, and recommend a mix of the two, so as...