What MediaTek’s new flagship Dimensity 9400 tells us about where AI is headed on mobile
As is typical, my latest “next big AI startup” idea was AI agents on the phone. Basically, I heard Chamath on the All In Podcast say that AI agents have to evolve toward practical use cases, and he asked the right question as an example:
Book me the cheapest cab to downtown!
This is actually a good AI agent problem to build. Let’s break down what it would take:
Prerequisites for an AI Agent on Mobile
- Ideally an on-device LLM for fast inference and low latency
- Fine-tuned for mobile-specific context such as apps and interactions. For the above command, it should know that the cab apps in the UAE, for example, are Uber and Careem
- The right permissions – Now how does the agent actually take these actions? In our case we need the agent to:
- Figure out what the cab apps are
- Have location permission to figure out the region the user is in to understand local apps
- Now here’s an interesting conundrum – how does the agent actually talk to the apps themselves? Direct API integrations would be tedious: authentication has to be maintained constantly, and integrations need updating as new apps and features pop up all the time
- The most practical approach – have a super agent with the same user-level permissions to view screen recordings, draw over other apps, and interact with any app components such as buttons, fields, and even payments on behalf of the user, with full approval and transparency
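To make the flow above concrete, here is a minimal sketch in Python of the decision loop such a super agent would run for the cab-booking command. The region-to-app mapping and the action strings are illustrative assumptions, not a real catalog or API – on a real device each step would be carried out through screen reading and simulated taps.

```python
# Illustrative sketch of the agent's planning step for "book me the
# cheapest cab". The app list per region is an assumption for the example.
REGIONAL_CAB_APPS = {
    "UAE": ["Uber", "Careem"],
    "US": ["Uber", "Lyft"],
}

def plan_cab_booking(region: str, destination: str) -> list[str]:
    """Build the ordered action plan the screen-level agent would execute."""
    # Fall back to a globally available app if the region is unknown.
    apps = REGIONAL_CAB_APPS.get(region, ["Uber"])
    plan = []
    for app in apps:
        plan.append(f"open {app}")
        plan.append(f"enter destination '{destination}'")
        plan.append(f"read fare estimate from {app} screen")
    plan.append("compare fares and pick the cheapest")
    plan.append("confirm booking with explicit user approval")
    return plan

if __name__ == "__main__":
    for step in plan_cab_booking("UAE", "downtown"):
        print(step)
```

The key design point is the last step: the agent gathers fares on its own, but the final booking (and payment) only happens with the user’s explicit approval, which is the transparency requirement described above.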
In terms of feasibility, it was only a matter of time, because several browser-based agents were already doing the same automated interactions on behalf of the user. Some examples I found on YouTube:
I also saw something about a tool called HyperWrite AI, but could not find the tool more recently.
But the NPU on the latest Dimensity 9400 accelerates all of this.
There are several AI-focused features already:
- The 1st high-quality on-device video generation
- 1st on-device LoRA training (image generation, not the IoT protocol 😭)
- 80% faster LLM prompt performance
- MediaTek claims its chip can do around 50 tokens per second for AI computation, compared to the 20 tokens per second of Qualcomm’s current-generation Snapdragon 8 Gen 3
- But most importantly... there is an announcement of the world’s 1st Agentic AI Engine?
But what does this really mean? How can something at the chip level be “Agentic” without OS-level integrations to run an LLM and execute agentic actions? This would surely need an integration with Google to run at the Android level... wait a minute.
So we now have a clear idea of what 2025 Android flagships will look like:
- On-device LLMs will be the rage. There will be fragmentation, with LLMs vying for deeper integrations with OEMs. Apple has built its own flavor with Apple Intelligence, but among the major Android OEMs there will be a race to integrate smaller models fine-tuned for mobile usage. Another head start for Google here.
- Everything AI will be faster – with custom-designed NPUs optimized for AI workloads, everything from image generation to running multimodal LLMs will speed up. Longer context, SLM+LLM hybrid models, multi-modality, etc. will be key features in all upcoming chipsets, including the upcoming Snapdragon 8 Gen 4.
- AI Agents come to life – with a combination of custom chips and OS-level integrations, we will finally see viable AI agents capable of orchestrating tasks end-to-end
Mobile phones will be the next big platform to be infused with AI, and 2025 will be the year of AI on mobile!