TECHNICAL PAPERS

FP8: Cross-Industry Hardware Specification For AI Training And Inference (Arm, Intel, Nvidia)

September 16th, 2022 - By: Technical Paper Link

Arm, Intel, and Nvidia have proposed a specification for an 8-bit floating point (FP8) format intended as a common interchange format for both AI training and inference, so that AI models operate and perform consistently across hardware platforms.

The technical paper is titled “FP8 Formats For Deep Learning” and was published in September 2022.

Abstract:
“FP8 is a natural progression for accelerating deep learning training inference beyond the 16-bit formats common in modern processors. In this paper we propose an 8-bit floating point (FP8) binary interchange format consisting of two encodings – E4M3 (4-bit exponent and 3-bit mantissa) and E5M2 (5-bit exponent and 2-bit mantissa). While E5M2 follows IEEE 754 conventions for representation of special values, E4M3’s dynamic range is extended by not representing infinities and having only one mantissa bit-pattern for NaNs. We demonstrate the efficacy of the FP8 format on a variety of image and language tasks, effectively matching the result quality achieved by 16-bit training sessions. Our study covers the main modern neural network architectures – CNNs, RNNs, and Transformer-based models, leaving all the hyperparameters unchanged from the 16-bit baseline training sessions. Our training experiments include large, up to 175B parameter, language models. We also examine FP8 post-training-quantization of language models trained using 16-bit formats that resisted fixed point int8 quantization.”

Authors: Paulius Micikevicius, Dusan Stosic, Neil Burgess, Marius Cornea, Pradeep Dubey, Richard Grisenthwaite, Sangwon Ha, Alexander Heinecke, Patrick Judd, John Kamalu, Naveen Mellempudi, Stuart Oberman, Mohammad Shoeybi, Michael Siu, Hao Wu.

Citation: arXiv:2209.05433v1.
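
The two encodings named in the abstract can be illustrated with a small decoder. The following is a minimal Python sketch, not code from the paper: the function name decode_fp8 is hypothetical, and the exponent biases (7 for E4M3, 15 for E5M2) are an assumption taken from the paper's format definitions rather than from the abstract above. It shows how E5M2 reserves the all-ones exponent for infinities and NaNs in the IEEE 754 manner, while E4M3 gives up infinities and keeps a single NaN mantissa pattern so that the remaining all-ones-exponent codes can hold ordinary values.

```python
import math


def decode_fp8(byte: int, fmt: str = "E4M3") -> float:
    """Decode one FP8 bit pattern (0..255) into a Python float.

    Hypothetical helper for illustration only. Biases are assumed:
    7 for E4M3 and 15 for E5M2.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in the range 0..255")
    if fmt == "E4M3":
        exp_bits, man_bits, bias = 4, 3, 7
    elif fmt == "E5M2":
        exp_bits, man_bits, bias = 5, 2, 15
    else:
        raise ValueError("fmt must be 'E4M3' or 'E5M2'")

    # Layout is sign | exponent | mantissa, most significant bit first.
    sign = -1.0 if (byte >> 7) & 1 else 1.0
    exp_all_ones = (1 << exp_bits) - 1
    man_all_ones = (1 << man_bits) - 1
    exp = (byte >> man_bits) & exp_all_ones
    man = byte & man_all_ones

    if fmt == "E5M2" and exp == exp_all_ones:
        # IEEE 754 style: zero mantissa is infinity, anything else is NaN.
        return sign * math.inf if man == 0 else math.nan
    if fmt == "E4M3" and exp == exp_all_ones and man == man_all_ones:
        # E4M3 has no infinities and only one mantissa pattern for NaN.
        return math.nan

    if exp == 0:
        # Zero and subnormals: no implicit leading 1.
        return sign * (man / (1 << man_bits)) * 2.0 ** (1 - bias)
    # Normal numbers: implicit leading 1.
    return sign * (1.0 + man / (1 << man_bits)) * 2.0 ** (exp - bias)


if __name__ == "__main__":
    print(decode_fp8(0x7E, "E4M3"))  # largest finite E4M3 value: 448.0
    print(decode_fp8(0x7B, "E5M2"))  # largest finite E5M2 value: 57344.0
    print(decode_fp8(0x7C, "E5M2"))  # inf
```

Under these assumptions, the bit pattern 0x7E decodes to 448 in E4M3, a value that would be unavailable if the all-ones exponent were reserved entirely for special values, whereas E5M2 trades one mantissa bit for a wider exponent range and tops out at 57344.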
