Fig. 1: Data center in Virginia. (Source: Wikimedia Commons)
Every time we ask a question online, stream a movie, or generate an image with artificial intelligence, electricity flows through a physical facility called a data center - a warehouse of servers, storage, and cooling equipment that keeps the digital world running (see Fig. 1).
The rapid rise of generative AI, especially large language models (LLMs) such as GPT, LLaMA, and Gemini, has driven an unprecedented increase in data-center power demand. Global data centers consumed about 415 TWh of electricity in 2024 - roughly 1.5% of total world demand. [1]
Data-center power use can be grouped into three broad categories:
Training is the development of models by adjusting billions of parameters on large datasets. Training an AI model ultimately draws power in proportion to the number of floating-point operations (FLOPs) executed and the energy cost per operation. A FLOP simply represents one mathematical step involving real numbers, such as adding, multiplying, or dividing two decimal values.
Modern GPUs perform computation at about ε ≈ 2 × 10^-10 J/FLOP. [2] A GPT-4-scale model requires roughly N_FLOP ≈ 10^24 floating-point operations. [3] Hence, the total energy for one full training run is:
E = N_FLOP × ε ≈ 10^24 FLOP × 2 × 10^-10 J/FLOP = 2 × 10^14 J
Expressed as GWh this becomes:
E = (2 × 10^14 J) / (3.6 × 10^6 J/kWh) ≈ 5.6 × 10^7 kWh = 56 GWh
This single training cycle therefore consumes on the order of tens of gigawatt-hours, comparable to the annual electricity use of a small town.
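The training-energy arithmetic above can be sketched as a short calculation, using the ε and N_FLOP figures cited in the text:

```python
# Back-of-envelope training energy, using the figures from the text:
# epsilon ~ 2e-10 J/FLOP [2] and N_FLOP ~ 1e24 for a GPT-4-scale run [3].
EPSILON_J_PER_FLOP = 2e-10   # energy cost per floating-point operation
N_FLOP = 1e24                # total operations in one full training run
J_PER_KWH = 3.6e6            # joules in one kilowatt-hour

energy_j = N_FLOP * EPSILON_J_PER_FLOP   # total energy in joules
energy_kwh = energy_j / J_PER_KWH        # convert to kilowatt-hours
energy_gwh = energy_kwh / 1e6            # convert to gigawatt-hours

print(f"{energy_j:.1e} J = {energy_gwh:.0f} GWh")  # → 2.0e+14 J = 56 GWh
```

Any change in hardware efficiency (ε) or model scale (N_FLOP) propagates linearly through this estimate.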
Inference refers to the use of trained models to generate responses, images, or predictions for users. The electricity required for a single AI inference (a model generating a response to an input) depends on the model size, architecture, prompt length, and task complexity.
As an example, ChatGPT consumes roughly 2.9 Wh per request, though this varies considerably by model. [4] More complex workloads such as image generation or multimodal reasoning can demand 10-20 Wh per inference, whereas lightweight text-classification tasks consume orders of magnitude less. [5]
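Per-request figures become significant only at scale. As a minimal sketch, assuming a hypothetical volume of one billion requests per day (an illustrative figure, not from the sources) at the cited 2.9 Wh each:

```python
# Scaling per-request inference energy to an annual total.
# WH_PER_REQUEST is the de Vries estimate for ChatGPT [4];
# REQUESTS_PER_DAY is an assumed volume, for illustration only.
WH_PER_REQUEST = 2.9
REQUESTS_PER_DAY = 1e9

daily_kwh = WH_PER_REQUEST * REQUESTS_PER_DAY / 1e3  # Wh -> kWh
annual_gwh = daily_kwh * 365 / 1e6                   # kWh/day -> GWh/yr

print(f"~{annual_gwh:.0f} GWh/yr")  # on the order of 1 TWh per year
```

Under these assumptions, a year of inference alone rivals many full training runs, which is why inference is often the larger share of an LLM's lifetime energy use.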
Everything else includes cooling, backup power, and other auxiliary systems that keep the servers operating safely. In AI-optimized or hyperscale data centers, not all electricity powers computation: a significant share goes to the systems that keep the servers operational, such as cooling equipment, power distribution units, and backup infrastructure.
Energy efficiency is measured using the metric Power Usage Effectiveness (PUE), defined as PUE = P_total / P_IT, where P_total is the total facility electricity use and P_IT is the power delivered to the servers. A PUE of 1.0 would mean every joule goes to computing; higher values indicate additional energy for cooling and auxiliaries. The average hyperscale data center operates around PUE ≈ 1.4. [6] Thus, the fraction of electricity used directly for IT equipment is:
P_IT / P_total = 1 / PUE = 1 / 1.4 ≈ 0.71
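The IT fraction implied by a given PUE follows directly from the definition:

```python
# Fraction of facility electricity that reaches the IT equipment,
# from the definition PUE = P_total / P_IT.
def it_fraction(pue: float) -> float:
    """Return P_IT / P_total = 1 / PUE."""
    return 1.0 / pue

# Average hyperscale figure cited in the text [6]:
print(f"{it_fraction(1.4):.2f}")  # → 0.71
```

So at PUE ≈ 1.4, roughly 29% of every delivered kilowatt-hour goes to cooling and other overhead rather than computation.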
The IEA (2024) projects that global data-center electricity demand will rise from ≈ 415 TWh in 2024 to between 700 TWh and 1,700 TWh by 2035, depending on technology progress and AI growth. [1] Their four scenarios are:
Base Case: 1,000 TWh by 2035 - This follows current trends in server shipments and efficiency.
Lift-Off: 1,700 TWh - This assumes rapid AI expansion and faster data-center deployment.
High-Efficiency: 900 TWh - This assumes similar demand for digital services but stronger hardware/software efficiency.
Headwinds: 700 TWh - This assumes slower AI adoption and local supply-chain constraints.
This equates to a compound annual growth rate (CAGR) of roughly 5-14%, similar to other estimates from EPRI and LBNL. [6,7]
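The implied growth-rate range can be checked from the scenario endpoints, taking 2024-2035 as an 11-year horizon:

```python
# Implied compound annual growth rate (CAGR) for the IEA scenarios [1]:
# from 415 TWh in 2024 to 700-1,700 TWh in 2035.
def cagr(start: float, end: float, years: int) -> float:
    """Constant annual growth rate taking `start` to `end` over `years`."""
    return (end / start) ** (1 / years) - 1

low = cagr(415, 700, 11)    # Headwinds scenario
high = cagr(415, 1700, 11)  # Lift-Off scenario
print(f"{low:.1%} to {high:.1%}")  # → 4.9% to 13.7%
```

Rounding the endpoints gives the ~5-14% range quoted in the text.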
Across all cases, the relationship follows the classic Koomey-style trend: hardware efficiency continues to improve, yet total computational demand expands even more, producing a net rise in energy use. [8] The physical efficiency frontier is approaching; PUE values are already near 1.3, and even a perfect PUE = 1.0 would save only
415 TWh × (1 − 1/1.4) ≈ 120 TWh
This is only about one year of expected AI-related growth (under the IEA base case).
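The overhead bound above can be verified numerically:

```python
# Upper bound on overhead savings: if every data center reached PUE = 1.0
# from today's average of ~1.4, the non-IT share of 2024 demand vanishes.
TOTAL_TWH_2024 = 415   # global data-center demand, 2024 [1]
PUE = 1.4              # average hyperscale PUE [6]

savings_twh = TOTAL_TWH_2024 * (1 - 1 / PUE)
print(f"~{savings_twh:.0f} TWh")  # ~119 TWh, i.e. about 120 TWh
```

This is a one-time saving: once PUE reaches 1.0 there is no further overhead to eliminate, while compute demand keeps compounding.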
The story ends where it began: the world's data centers now draw hundreds of terawatt-hours each year to keep digital life running. Efficiency gains will help, but as Koomey's law reminds us, demand for computation keeps outpacing our ability to economize. Even in the cloud, the limits of physics still apply.
© Jeffrey Xia. The author warrants that the work is the author's own and that Stanford University provided no input other than typesetting and referencing guidelines. The author grants permission to copy, distribute and display this work in unaltered form, with attribution to the author, for noncommercial purposes only. All other rights, including commercial rights, are reserved to the author.
[1] "Energy and AI," International Energy Agency, April 2025.
[2] S. Shankar and A. Reuther, "Trends in Energy Estimates for Computing in AI/Machine Learning Accelerators, Supercomputers, and Compute-Intensive Applications," IEEE 9926296, IEEE High Performance Extreme Computing Conf., 19 Sep 22.
[3] G. Cheng, ChatGPT: Principles and Architecture (Elsevier, 2025), pp. 187-283.
[4] A. de Vries, "The Growing Energy Footprint of Artificial Intelligence," Joule 7, 2191 (2023).
[5] A. S. Luccioni, Y. Jernite, and E. Strubell, "Power Hungry Processing: Watts Driving the Cost of AI Deployment?," ACM 3630106, Proc. 2024 ACM Conference on Fairness, Accountability, and Transparency, (Association for Computing Machinery, June 2024), p. 85.
[6] A. Shehabi et al., "2024 United States Data Center Energy Usage Report," Lawrence Berkeley National Laboratory, LBNL-2001637, December 2024.
[7] "Scaling Intelligence: The Exponential Growth of AI's Power Needs," Electric Power Research Institute, 2025.
[8] A. Prieto et al., "Evolution of Computing Energy Efficiency: Koomey's Law Revisited," Clust. Comput. 28, 42 (2025).