Alibaba, Tencent lead pivot from chatbots to embodied AI for robotics
Chinese tech companies are racing to deploy artificial intelligence models into robots, shifting the battleground for generative AI from digital chatbots to physical autonomous systems.
The race in Chinese artificial intelligence has taken a decisive physical turn. Alibaba and Tencent are leading a push to embed generative AI into robots, moving the battlefield from text-generating chatbots to machines that can navigate, grip, and clean. The logic is straightforward: the real commercial value of large language models may lie not in answering questions, but in controlling hardware that interacts with the physical world.
Alibaba’s latest Qwen3.7-Max model, released last week, is built around “tool-calling” capabilities that let the AI function as a digital brain, triggering external software and hardware components. The company has simultaneously released a suite of supporting AI models for robotics, including a robotic gripper agent, a navigation model, and a vision-language system designed for physical-world interaction. This is not a side project. It signals a strategic bet that the next frontier for AI is embodied, not disembodied.
The bottleneck is data. A co-founder of AgiBot, a leading Chinese embodied AI startup, recently highlighted the scale of the problem. While GPT-5 was trained on data equivalent to roughly 10 billion hours, the entire robotics industry had access to only about 500,000 hours of high-quality embodied AI data, a gap of roughly 20,000 to one. That gap explains why progress in physical AI has lagged behind language models. A Goldman Sachs report this week identified “high-quality real-world data” as one of the primary constraints in the field.
Chinese startups are already moving to bridge that gap. X Square Robot announced last week it had partnered with 58 Daojia, a home-services platform, to launch robot-assisted household cleaning services in Beijing and Shenzhen. This is a real-world deployment, not a lab demo. The robots are learning from actual homes, generating the kind of messy, unstructured data that pure simulations cannot replicate.
What a casual observer might miss is how this shift reshapes competition. In chatbots, the advantage goes to whoever has the most text data and compute. In embodied AI, the winners will be those who control the physical environments where robots operate. Alibaba’s logistics network, Tencent’s gaming engines for simulation, and Meituan’s delivery infrastructure all become strategic assets. The battleground is no longer just model parameters, but access to real-world feedback loops.
Physical AI, as one industry veteran put it, is about how AI understands the real world and completes tasks in real-world environments. That demands more than a chatbot’s ability to generate plausible sentences. It requires models that can handle uncertainty, adapt to changing conditions, and learn from physical consequences. The technical challenges are immense, but the payoff, a machine that can truly operate alongside humans, is orders of magnitude larger than that of any language model.
The pivot from chatbots to robots is not a retreat from generative AI. It is an expansion of what generative AI can do. The next phase of the race will be defined not by who builds the smartest digital assistant, but by who can make that intelligence walk, grab, and clean.
Alibaba and Tencent are betting that generative AI’s real value lies in controlling physical robots, not just answering questions.
The development adds to a wider Greater China AI and machine learning story in which companies are being judged on execution, capital access, regulatory fit and the credibility of their regional expansion plans.
For business readers, the important question is whether this becomes an isolated announcement or part of a more durable operating pattern across customers, financing channels, partners and public-market expectations.