Moore’s Law just does not work in the AI era
While Moore’s Law predicted the exponential growth of transistor counts, the real drivers of AI progress today are the AI scaling laws, and the growth is far faster. As Satya Nadella pointed out at the recent Microsoft AI Tour in London, AI models have been scaling at 4.1x per year, far outpacing Moore’s Law. But scaling what? Brute compute, mostly, which Azure and Nvidia are obviously delighted about. The more interesting question is what that buys in actual model performance.
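To put that growth rate in perspective, here is a quick back-of-the-envelope comparison. The only inputs are the classic Moore's Law figure (roughly 2x every two years) and the 4.1x-per-year figure Nadella cites; everything else is simple compounding, so treat it as an illustration rather than a forecast.

```python
# Compound growth: Moore's Law (~2x every 2 years) vs. AI training compute (~4.1x/year).
# Illustrative arithmetic only; the 4.1x figure is the one quoted above.

MOORE_PER_YEAR = 2 ** (1 / 2)   # ~1.41x per year
AI_PER_YEAR = 4.1               # ~4.1x per year

for years in (1, 5, 10):
    moore = MOORE_PER_YEAR ** years
    ai = AI_PER_YEAR ** years
    print(f"{years:>2} yrs: Moore ~{moore:.1f}x, AI scaling ~{ai:,.0f}x "
          f"(gap ~{ai / moore:,.0f}x)")

# After a decade: ~32x from Moore's Law vs ~1.3 million x from AI scaling.
```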
This exponential growth in compute capacity is critical to advancing the performance of large models like GPT-4o, which rely on massive compute resources to achieve their current capabilities. However, the story doesn’t end with pre-training compute scaling. We’re beginning to witness a shift toward inference-time scaling, most visibly with OpenAI’s o1 model. Inference scaling reallocates compute from training to inference, allowing models to engage in System 2 thinking: slower, more deliberate reasoning over extended periods of time. This shift enables AI systems to “think” deeply about complex problems, such as writing intricate software or proving mathematical theorems, much like human System 2 processing.
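One simple way to picture inference-time scaling, as a toy model of my own rather than a description of how o1 actually works: treat extra “thinking” as repeated independent attempts at a hard problem, filtered by a verifier. Even a low per-attempt success rate compounds quickly as you spend more inference compute.

```python
# Toy model of inference-time scaling (my sketch, not OpenAI's o1 method):
# if a single forward pass solves a hard problem with probability p, then
# spending k times more inference compute on independent attempts (plus a
# reliable verifier to pick a correct one) lifts success to 1 - (1 - p)^k.

def success_rate(p_single: float, attempts: int) -> float:
    """Probability that at least one of `attempts` independent tries succeeds."""
    return 1 - (1 - p_single) ** attempts

p = 0.05  # assumed per-attempt success rate on a hard reasoning task
for k in (1, 10, 100, 1000):
    print(f"{k:>4} attempts -> {success_rate(p, k):.1%} success")

# 1 -> 5.0%, 10 -> 40.1%, 100 -> 99.4%, 1000 -> ~100%
```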
While pre-training compute has doubled roughly every six months for the past 14 years, we’re also approaching a data wall around 2027-28: the stock of high-quality training data simply won’t keep pace. Scaling up compute alone won’t be enough if we don’t address the limits of data quality and availability. The breakthrough lies in leveraging inference-time compute, where AI systems can deliberate longer on tasks like reasoning, planning, and problem-solving.
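A rough sketch of why the data wall bites, using the Chinchilla-style rule of thumb that compute-optimal training spends C ≈ 6·N·D FLOPs on roughly D ≈ 20·N tokens for N parameters. The baseline numbers here (a ~1e25 FLOP frontier run in 2024, ~4x yearly compute growth, ~50 trillion usable high-quality tokens) are illustrative assumptions of mine, not figures from this piece.

```python
import math

# Data-wall projection under stated assumptions (all numbers illustrative).
# Chinchilla-style compute-optimal training: C = 6*N*D with D = 20*N,
# so the token demand for a compute budget C is D_opt = 20 * sqrt(C / 120).

COMPUTE_2024 = 1e25       # assumed frontier training run, FLOPs
GROWTH_PER_YEAR = 4.0     # assumed yearly growth in training compute
TOKEN_STOCK = 5e13        # assumed usable high-quality text, ~50T tokens

for year in range(2024, 2029):
    compute = COMPUTE_2024 * GROWTH_PER_YEAR ** (year - 2024)
    tokens_wanted = 20 * math.sqrt(compute / 120)   # compute-optimal token count
    flag = "  <-- exceeds available data" if tokens_wanted > TOKEN_STOCK else ""
    print(f"{year}: ~{tokens_wanted:.1e} tokens needed{flag}")

# Under these assumptions, compute-optimal token demand overtakes the ~50T
# stock by 2028 (2027 already consumes over 90% of it), which is why
# inference-time compute becomes the next lever to pull.
```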
The promise of inference scaling is immense: it allows AI to tackle more complex, long-horizon tasks like strategic planning and financial analysis. With o1-style models, we’re seeing AI that doesn’t just automate tasks but reasons at scale, significantly lowering the cost of replacing high-skilled human work in areas like software engineering or customer service.
Ultimately, AI scaling isn’t just about brute-force compute or longer inference times; it’s about striking the right strategic trade-offs between compute, model efficiency, and specialized hardware. As scaling laws drive the exponential growth of AI capabilities, the focus will also shift toward hardware innovations such as TPUs and 3D chip architectures to support these massive workloads sustainably.
The next frontier for AI will demand algorithmic breakthroughs that optimize both training and inference to unlock higher performance at a lower compute cost, ensuring that the economic and environmental footprint of AI remains viable.