The Chinese tech ecosystem is rapidly developing successful foundational models and deploying them at the edge.
China’s investment in Gen-AI is projected to surge with an estimated 86% CAGR over the next five years. This growth is driven by a focus on technological self-sufficiency, from applications to chips, and a strong emphasis on locally developed technology. Key areas of development include:
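To give a sense of scale for the 86% CAGR figure, the short sketch below compounds that rate over five years. It assumes simple annual compounding from a normalized starting level of 1; the function name is illustrative and the output is arithmetic on the quoted rate, not a forecast of actual spending.

```python
def growth_multiple(cagr: float, years: int) -> float:
    """Total growth factor after compounding `cagr` once per year for `years` years."""
    return (1.0 + cagr) ** years


if __name__ == "__main__":
    # 0.86 and 5 are the figures quoted in the text above.
    multiple = growth_multiple(0.86, 5)
    print(f"An 86% CAGR over 5 years is roughly {multiple:.1f}x the starting level")
```

Under these assumptions, investment at the end of the period would be roughly 22 times the starting level.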
Chinese labs have contributed widely to the AI landscape, particularly in open-source AI ecosystems. These include:
China’s foundational models are benchmarking well, with models like Qwen2.5-72B-instruct and GLM-4-plus showing significant improvements in instruction-following, long text generation, and structured data understanding. The latest Qwen 2.5 models were pre-trained on up to 18 trillion tokens and have been shown to be resilient to diverse system prompts, enhancing their utility in various applications.
Tencent’s Hunyuan-Large has 389 billion total parameters, of which 52 billion are activated per token, and a context window of up to 256,000 tokens. It stands out as the largest open-source transformer-based mixture-of-experts model and performs well in benchmarks for language understanding, logical reasoning, and more, outperforming many larger models.
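The gap between total and activated parameters is what makes a mixture-of-experts model cheaper to run than its headline size suggests. As a rough illustration using only the Hunyuan-Large figures quoted above, the sketch below computes the share of parameters active per token. The function name is hypothetical, and the ratio is only a first-order proxy for per-token compute: it ignores routing overhead and the fact that all weights must still be stored.

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Share of a mixture-of-experts model's parameters used for each token."""
    if total_params_b <= 0:
        raise ValueError("total parameter count must be positive")
    return active_params_b / total_params_b


if __name__ == "__main__":
    # 389B total and 52B activated parameters, as quoted for Hunyuan-Large.
    share = active_fraction(389, 52)
    print(f"About {share:.1%} of the parameters are active per token")
```

With these figures, only about 13% of the model's parameters take part in processing any single token.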
China has an active open-source LLM community, with models like DeepSeek-V2, a mixture-of-experts language model with 236 billion total parameters that is well regarded for economical training and efficient inference. These models support general conversational capabilities, robust code processing, and better alignment with human preferences.
Several influential models below 10 billion parameters are emerging, such as GLM-4-9B-Chat and MiniCPM-2B, which perform well in Chinese-language tasks and other applications. Combined with the continuous maturation of compute hardware and software for edge devices, this enables the integration of these models into cars, robots, wearables and more. Taking smartphones as a specific edge example, many OEMs are integrating advanced AI models, some as large as 7 billion parameters, into their latest devices to enable advanced applications like image generation, text understanding and AI agents.
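To see why models in this size range are the ones reaching phones, it helps to estimate how much memory the weights alone occupy. The sketch below is a back-of-the-envelope calculation, not a measurement of any particular device or model: the precisions listed (fp16, int8, int4) are assumptions chosen for illustration, and the estimate ignores the KV cache, activations and runtime overhead.

```python
# Bytes needed to store one weight at each (assumed) numeric precision.
BYTES_PER_WEIGHT = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Approximate storage for the model weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_WEIGHT[precision] / 1e9


if __name__ == "__main__":
    # 7 billion parameters is the upper end of the on-device range mentioned above.
    for precision in BYTES_PER_WEIGHT:
        size = weight_memory_gb(7e9, precision)
        print(f"7B parameters at {precision}: about {size:.1f} GB of weights")
```

Under these assumptions a 7-billion-parameter model needs about 14 GB at 16-bit precision but about 3.5 GB at 4-bit precision, which is why lower-precision formats matter so much for memory-constrained devices.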
Why is all this effort going into creating smaller models that run on edge devices? Deploying Gen-AI at the edge rather than via the cloud enables faster response times, greater data privacy, personalized user experiences and lower cloud inference costs. Yet with many AI models still being compute-intensive, fully realizing the opportunity of AI at the edge is an ongoing engineering effort.
To succeed, edge devices need to balance computational and memory constraints, power and energy limitations, and the heterogeneous nature of edge computing hardware. In particular, AI at the edge needs to coexist with many other critical user-facing and device management tasks without exceeding the thermal budget of the device. GPUs offer the programmability, flexibility, efficiency and compute performance to bring Gen-AI applications to the edge.
The technology ecosystem in China is rapidly developing successful foundational models and deploying them at the edge. With continuous innovation and a strong focus on localization and customization, consumers will soon see even more powerful AI capabilities coming to their devices.
Zack Zheng, Imagination’s director of product management in China, further explains the progress China has made in developing foundational models and their deployment at the edge, highlighting key trends, technological advancements, and the future outlook in this webinar. Visit our AI pages to find out more about Imagination’s solutions for Generative AI.