A new technical paper titled "Intelligence per Watt: Measuring Intelligence Efficiency of Local AI" was published by researchers at Stanford University and Together AI.
Abstract:
"Large language model (LLM) queries are predominantly processed by frontier models in centralized cloud infrastructure. Rapidly growing demand strains this paradigm, and cloud providers struggle to scale infrastruc...