Essentially all AI training is done with 32-bit floating point.
But doing AI inference with 32-bit floating point is expensive, power-hungry and slow.
And quantizing models to 8-bit integer, which is very fast and uses the least power, is a major investment of money, scarce engineering talent and time.
Now BFloat16 (BF16) offers an attractive balance for many users. BFloat16 offers essentially the same dynamic range as 32-bit floating point in half the bits: it keeps the full 8-bit exponent and gives up mantissa precision instead, so models trained in FP32 can typically run in BF16 without the retraining or calibration work that integer quantization demands.
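To make the relationship concrete, here is a minimal sketch of how an FP32 value maps to BF16: the top 16 bits (sign, 8-bit exponent, 7 mantissa bits) are kept and the low 16 mantissa bits are dropped. This simplified version truncates rather than rounds (real hardware typically uses round-to-nearest-even), and the function names are illustrative, not from any particular library.

```python
import struct

def fp32_to_bf16_bits(x: float) -> int:
    """Truncate an FP32 value to a BF16 bit pattern by keeping the top
    16 bits: sign + 8-bit exponent + 7-bit mantissa. Rounding is omitted
    for brevity."""
    fp32_bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return fp32_bits >> 16

def bf16_bits_to_fp32(bits: int) -> float:
    """Widen a BF16 bit pattern back to FP32 by zero-filling the low
    16 mantissa bits. The exponent is unchanged, so no range is lost."""
    return struct.unpack("<f", struct.pack("<I", bits << 16))[0]

if __name__ == "__main__":
    # The exponent survives intact (even tiny and huge values keep their
    # magnitude); only the trailing mantissa digits are lost.
    for x in (3.14159265, 1.0e-38, 6.55e4):
        y = bf16_bits_to_fp32(fp32_to_bf16_bits(x))
        print(f"{x:>14.8g} -> {y:>14.8g}")
```

Run against a few sample values, this shows why BF16 is attractive for inference: the number range matches FP32, so the main cost is a few digits of precision rather than a full quantization effort.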