When companies describe their AI inference chips, they typically quote TOPS but say little about the memory system, which is equally important.
What is TOPS? It stands for Tera (trillions of) Operations Per Second. It measures the maximum achievable throughput, not the actual throughput. Most operations are MACs (multiply-accumulates), so TOPS = (number of MAC units) x...
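To make the gap between peak and actual throughput concrete, here is a minimal sketch. The formula above is cut off, so this does not reproduce it; it assumes the common convention that each MAC counts as two operations (one multiply, one add) and that peak throughput scales with clock frequency. The function names, the 4096-MAC / 1 GHz configuration, and the 25% utilization figure are all hypothetical, chosen only for illustration.

```python
def peak_tops(mac_units: int, clock_hz: float, ops_per_mac: int = 2) -> float:
    """Peak throughput in tera-operations per second.

    Assumption (not from the text): each MAC unit completes one
    multiply-accumulate per clock cycle, counted as `ops_per_mac` operations.
    """
    return mac_units * clock_hz * ops_per_mac / 1e12


def effective_tops(peak: float, utilization: float) -> float:
    """Actual throughput, given the fraction of cycles the MACs do useful work."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return peak * utilization


# Hypothetical chip: 4096 MAC units clocked at 1 GHz.
peak = peak_tops(mac_units=4096, clock_hz=1e9)

# Hypothetical case: the memory system keeps the MACs busy only 25% of the time.
actual = effective_tops(peak, utilization=0.25)

print(f"peak: {peak:.3f} TOPS, actual: {actual:.3f} TOPS")
```

Under these assumptions, the same chip can deliver very different real throughput depending on how well the memory system feeds the MAC units, which is why a TOPS figure alone says little.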