Intelligence as
Infrastructure.
We are currently using the world's most vulnerable hardware, powered by the world's most volatile energy. Measurement is the only way to prove we don't need to burn the planet to power the future. Our data proves we can do world-class science — we have the capacity to do hard things and to do them well. Now we must build the infrastructure to secure that future.
AI infrastructure is locking in 30–40 years of fossil dependency through new gas plants while consuming billions of gallons of water for cooling. Evaporative systems use 3–5x more water than closed-loop alternatives, straining drought-prone communities that host data centers.
Data center demand is projected to consume up to 12% of total US electricity by 2028, already driving regional utility bill increases of $11–$16/month and capacity price spikes of 833% in critical markets like Virginia.
While the US faces domestic renewable rollbacks and energy supply chain risks (evidenced by the 2026 Strait of Hormuz crisis), China is executing a "Dual-Track" strategy — surpassing 2030 renewable targets six years early and decoupling AI from urban grids via the "Eastern Data, Western Computing" initiative.
We must transition from "Phase 1" (unregulated growth) to "Phase 2" (Measurement and Transparency). Mandating granular, per-job energy reporting and incentivizing "temporal flexibility" (shifting training to peak renewable hours) is required to prevent 30-year fossil fuel infrastructure lock-in.
AI is one of the most powerful tools humanity has built. But its physical infrastructure is growing unchecked — impacting electricity grids, water systems, and communities. This isn't new, but the scale is unprecedented.
2000s: Internet boom. Data centers expanded rapidly. First wave of energy concern.
2010s: Efficiency gains matched demand growth. Cloud consolidation, better chips, virtualization. The industry solved its energy problem — temporarily.
Now: AI broke the efficiency curve. GPU-intensive training overwhelms the efficiency gains that kept demand flat for a decade.
The pattern: Data center energy was a concern in 2005, was managed through efficiency in the 2010s, and is now growing faster than efficiency can compensate — driven by AI training and inference at unprecedented scale. Source: Lawrence Berkeley National Lab, IEA Energy and AI 2026
Data centers projected to consume 6.7-12% of total US electricity by 2028.
Virginia: Bills up $11-16/month. PJM auction up 833%. 78% of voters blame data centers.
Ohio: Bills up ~$16/month. 12% above national average.
National: 21M households behind on utility bills. Data centers = 63% ($9.3B) of PJM capacity bill.
Sources: CNBC, Brookings, EESI, Virginia SB 253
Tech pledged Net Zero. AI broke those pledges.
Guardian analysis: actual emissions 7.6x higher than reported (2020-2022). Morgan Stanley: 2.5B tons CO2 by 2030.
Sources: Sierra Club, TechInformed, Morgan Stanley
Land: Hundreds of acres consumed, often zoned for agriculture/housing.
Noise: Industrial cooling systems affecting nearby residential areas.
Grid: Competing with hospitals, schools, homes for electricity. Who gets priority?
Jobs: A facility using as much power as a small city may employ 50-100 people.
Sources: Lincoln Institute, Consumer Reports
AI chips generate heat. Cooling them requires water. But the real concern isn't "per query" — it's where and when that water is consumed.
Per query: ~500ml per 20-50 ChatGPT prompts (UC Riverside estimate, methodology contested). More recent data: Google AI search ~10ml/query, Mistral ~3.5ml/response.
Google alone: 6.4 billion gallons in 2023. Council Bluffs, Iowa peaked at 2.7M gallons/day in summer 2024.
Projected US total: Could double or quadruple by 2028 to 150-280 billion liters/year.
Data centers use water when and where it's most scarce. Cooling demand spikes on the hottest days — exactly when farmers and residents need water most. In Arizona, Iowa, and Texas, this creates direct resource competition.
Policy should focus on "Peak Daily Withdrawal Limits" during droughts and heat waves, not just total annual gallons.
The "water bottle" headline applies to evaporative cooling — older tech that's cheap but consumes water by evaporating it. Like a giant swamp cooler.
Evaporative cooling: Cheap, evaporates water. Source of the "water bottle" headlines. Common in older facilities.
Closed-loop (dry) cooling: Like a car radiator. Near-zero water. More electricity for fans. The solution for water-stressed regions.
Most scrutiny focuses on water used at the data center. But the majority of water consumption happens upstream, at the power plant: coal and gas plants consume water to generate electricity.
Solar interconnectivity solves both. Solar and wind use virtually zero water. Moving AI to solar grids addresses water AND carbon simultaneously.
"The 'one water bottle per chat' headline is a symptom of outdated cooling tech. We are using 20th-century evaporation to cool 21st-century chips. Mandate transparency, incentivize dry-cooling + solar — grow scientific capacity without draining water tables."
Sources: Undark, Brookings, EESI, ScienceDirect
We've seen this before. Every transformative technology goes through the same cycle:
Phase 1 (unmeasured growth): Early cars: 8 MPG. No emissions controls. Leaded gasoline. Nobody measured the impact because the technology was too exciting.
Phase 2 (measurement): CAFE standards (1975). EPA fuel economy labels. Catalytic converters. You can't optimize what you can't measure. Regulation created visibility.
Phase 3 (optimization): Fuel injection. Aerodynamics. Weight reduction. Once measured, the industry optimized — 8 MPG became 30 MPG without sacrificing performance.
Phase 4 (reinvention): Hybrids. EVs. Regenerative braking. The technology that caused the problem became the solution — when policy pushed it there.
AI compute is in Phase 1. Nobody measures fleet utilization. Nobody reports per-job energy. Nobody knows if a GPU is right-sized to its workload.
We need to move to Phase 2 — measurement and transparency — before we can get to optimization. That's what our H200 data demonstrates: the measurement tools exist, the waste is real, and the path to efficiency starts with visibility.
We cannot fix what we do not measure. The energy cost of unoptimized AI compute is invisible across the industry.
15.4 kWh total across 6 tracked experiments (~13 hours of GPU time)
5.7 kg CO2 in carbon emissions at US average grid intensity
Annual energy wasted on idle GPUs across 100K researchers
Conservative estimate based on measured primary data
American homes powered for a year — on idle GPUs
One hyperparameter sweep = driving 265 miles
| Scale | Energy | CO2 | Equivalent |
|---|---|---|---|
| Our clinical research (6 runs) | 15.4 kWh | 5.7 kg | Half a day of home electricity |
| Full hyperparameter sweep | 288 kWh | 106 kg | Driving 265 miles |
| GPT-4 training (estimated) | 50,000,000 kWh | ~20M kg | Powering 4,500 homes for a year |
Logarithmic scale. Our data is measured. Industry estimates from published reports and researcher analysis.
Sources:
• Luccioni, S., Jernite, Y., & Strubell, E. "Power Hungry Processing: Watts Driving the Cost of AI Deployment." ACM FAccT 2024
• de Vries, A. "The Growing Energy Footprint of Artificial Intelligence." Joule, 2023
• Stanford HAI. "AI Index Report 2025 — Training Compute & Carbon Emissions." hai.stanford.edu — GPT-3: 588t CO2, GPT-4: 5,184t, Llama 3.1 405B: 8,930t
• IEA. "Electricity 2026 & Energy and AI." iea.org — Data centres: 415 TWh (2024), projected 945 TWh by 2030
• Epoch AI. "How Much Energy Does ChatGPT Use?" epoch.ai
Note: Foundation model estimates vary widely. Google TPU energy is per-chip more efficient than NVIDIA GPUs but total training uses thousands of chips. Anthropic does not disclose training compute. Bar widths are approximate log-scale representations.
We ran real clinical AI research on an NVIDIA H200 — among the most powerful GPUs in production — and tracked every watt, every byte, every cycle.
Every experiment tracked with real instrumentation. Not estimated. Not theoretical.
| Run | Time | Energy | CO2 | GPU % | Memory |
|---|---|---|---|---|---|
| LMM standalone | 104 min | 2.68 kWh | 0.99 kg | 11% | 1.72 / 150 GB |
| GRASP+LMM | 103 min | 2.66 kWh | 0.98 kg | 11% | 1.73 / 150 GB |
| GRASP+LMM+codemap (lr=5e-4) | 177 min | 4.60 kWh | 1.69 kg | 14% | 2.58 / 150 GB |
| GRASP+LMM+codemap (lr=1e-4) | 181 min | 4.67 kWh | 1.72 kg | 12% | 2.58 / 150 GB |
| GRASP+LMM+codemap (3 fixes) | 161 min | 0.61 kWh | 0.22 kg | 14% | 1.67 / 150 GB |
| LMM standalone + batch=256 | 42 min | 0.17 kWh | 0.06 kg | 16% | 3.39 / 150 GB |
The last row is the same model, same data, same architecture — with one parameter changed: batch_size=32 → 256. GPU utilization jumped from 11% to 16%. Memory usage doubled from 1.7 GB to 3.4 GB. Training time dropped from 104 minutes to 42 minutes. Energy fell from 2.68 kWh to 0.17 kWh — a 15x reduction.
Utilization isn't a fixed property of the hardware — it's a function of how the workload is configured. With larger datasets (MIMIC-IV, 200K+ patients), time series data, and full hyperparameter sweeps, utilization climbs further. This table captures early-stage research prototyping, not peak capacity.
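For readers who want to see how small this change is in practice, here is a minimal sketch. The dataset and model names are illustrative stand-ins, not our PyHealth pipeline; only the batch_size values come from the table above.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Illustrative stand-in for a clinical dataset (46,520 patients, as in MIMIC-III).
features = torch.randn(46_520, 128)
labels = torch.randint(0, 2, (46_520,))
dataset = TensorDataset(features, labels)

# The single parameter that separates the first and last rows of the table:
loader_before = DataLoader(dataset, batch_size=32, shuffle=True)   # 104 min, 2.68 kWh
loader_after = DataLoader(dataset, batch_size=256, shuffle=True)   # 42 min, 0.17 kWh

# Larger batches mean fewer kernel launches and higher GPU occupancy per step,
# so the same pass over the data finishes in far less wall-clock time.
```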
codecarbon (energy/CO2 from actual sensor readings) + pynvml (GPU metrics from NVIDIA driver)
NVIDIA H200 NVL, 150 GB HBM3e, CUDA 12.6
MIMIC-III (46,520 patients, public clinical data, Beth Israel Deaconess)
All code open source via PyHealth. Any researcher with GPU access can replicate.
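A minimal sketch of the instrumentation pattern, assuming one visible NVIDIA GPU; `train_fn` is a placeholder for any training loop. A real pipeline would sample the pynvml counters during training rather than once at the end.

```python
import pynvml
from codecarbon import EmissionsTracker

def run_tracked(train_fn):
    """Wrap any training function with energy/CO2 and GPU instrumentation."""
    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)

    tracker = EmissionsTracker()   # energy/CO2 from hardware sensor readings
    tracker.start()
    train_fn()
    co2_kg = tracker.stop()        # kg CO2eq for the whole run

    util_pct = pynvml.nvmlDeviceGetUtilizationRates(handle).gpu
    mem_gb = pynvml.nvmlDeviceGetMemoryInfo(handle).used / 1e9
    power_w = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000  # mW -> W
    pynvml.nvmlShutdown()

    print(f"{co2_kg:.2f} kg CO2 | {util_pct}% GPU | {mem_gb:.2f} GB | {power_w:.0f} W")
```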
The chip drawing 93W instead of 700W is good engineering. NVIDIA designed the H200 to scale its power draw to the workload. When the model needs less compute, the chip draws less power. It doesn't waste 607W heating empty silicon. That's efficient hardware design working as intended.
This is a shared academic cluster — when our job finishes, other researchers can use the same GPU. The chip isn't sitting idle between jobs in the same way a dedicated enterprise allocation would. We've generally been among the most active users, but we have no way to confirm fleet-wide utilization because that data isn't collected.
The infrastructure provisioning. The datacenter built cooling, power delivery, and physical space rated for 700W per chip. Whether our job draws 93W or 700W, that cooling infrastructure exists, consumes baseline power, and occupies space that could serve smaller, more efficient hardware.
The allocation mismatch. Our model used 1.7 GB of 150 GB memory. We could have selected a smaller chip — an A100 (80 GB, ~400W TDP) — and achieved the same scientific result. The cluster offers H200s because they handle everything, but "handles everything" means most research jobs are over-provisioned by default.
The measurement gap. Cluster administrators likely have access to fleet utilization data through scheduling and monitoring tools. But that data isn't published to users or reported publicly. As a researcher, I can measure my own job's utilization — I cannot see the fleet average. If even academic clusters with transparent governance don't surface aggregate utilization to their users, enterprise datacenters with commercial incentives to obscure waste certainly won't without a reporting mandate.
On this cluster, researchers could select different hardware tiers. Here's what our workload actually needed vs what was allocated:
| Hardware | Memory | TDP | Approx. cost | Runs our model? | Right-sized? |
|---|---|---|---|---|---|
| NVIDIA H200 NVL (what we used) | 150 GB | 700 W | ~$40,000 | Yes | No — 88x memory over-provisioned |
| NVIDIA A100 (available on cluster) | 80 GB | 400 W | ~$15,000 | Yes | Closer — still 47x over |
| NVIDIA RTX 4090 | 24 GB | 450 W | ~$1,600 | Yes | Better — 14x over |
| NVIDIA RTX 4060 (right-sized) | 8 GB | 115 W | ~$300 | Yes | Yes — 4.7x headroom |
Our model used 1.7 GB peak memory. An 8 GB consumer GPU with 4.7x headroom runs the same experiment at ~6x lower power provisioning and ~130x lower hardware cost. The science is identical. The energy footprint is dramatically different.
Note: Training speed would be ~1.5-2x slower on the RTX 4060 due to lower memory bandwidth. For a 42-minute experiment, that means ~60-80 minutes. Acceptable for research iteration, not for production inference.
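A sketch of a post-run right-sizing check, assuming a CUDA-capable PyTorch install and a completed training job. The card list mirrors the table above; the headroom bounds are our own illustrative rule of thumb, not a standard.

```python
import torch

# Run after training: compare peak GPU memory against candidate cards.
peak_gb = torch.cuda.max_memory_allocated() / 1e9  # e.g. ~1.7 GB for our model

cards_gb = {"H200 NVL": 150, "A100": 80, "RTX 4090": 24, "RTX 4060": 8}
for name, mem in cards_gb.items():
    ratio = mem / peak_gb
    if ratio < 1.5:
        verdict = "too small"                     # no room for memory spikes
    elif ratio <= 8:
        verdict = "right-sized"                   # comfortable headroom
    else:
        verdict = f"{ratio:.0f}x over-provisioned"
    print(f"{name}: {verdict}")
```

With our 1.7 GB peak, this reproduces the table: the H200 comes out 88x over-provisioned and the RTX 4060 right-sized.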
We didn't go to the moon just to walk on rocks. We went to prove we could master the physics of the impossible. Along the way we accelerated the cordless drill, the integrated circuit, and the water purifier. The waste isn't the electricity that wasn't used — it's the infrastructure ghost built around it.
The problem isn't the chip. The H200 scaling down to 93W is good engineering — NVIDIA designed it to draw only what the workload needs. That headroom gives researchers the freedom to scale. If the H200 gets you a mortality prediction in 42 minutes instead of 3 hours, that's the hard thing we want to keep doing.
The problem is the ghost target. When a utility planner in Northern Virginia or Ohio sees a new data center request, they look at the nameplate capacity. 10,000 H200s installed? The grid reserves 7 megawatts of capacity. But if those chips are doing research, inference, or cloud workloads — scaling down to 93W as in our case — the actual load is 0.93 megawatts.
The utility builds a gas plant and raises residential electric rates to cover 6.1 megawatts of demand that doesn't exist. There is a massive difference between a local circuit being ready for a surge and a national grid building 30-year gas plants for a surge that, our data shows, happens only a fraction of the time.
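The ghost-target arithmetic, spelled out. The 93 W figure is our measured draw; applying it fleet-wide is a deliberate simplification for illustration.

```python
chips = 10_000
nameplate_w = 700   # H200 TDP the utility plans against
measured_w = 93     # draw we measured on a research workload

reserved_mw = chips * nameplate_w / 1e6   # 7.0 MW reserved on the grid
actual_mw = chips * measured_w / 1e6      # 0.93 MW of real load
ghost_mw = reserved_mw - actual_mw        # 6.07 MW of phantom capacity

print(f"reserved {reserved_mw:.1f} MW | actual {actual_mw:.2f} MW | ghost {ghost_mw:.1f} MW")
```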
This isn't hypothetical. PJM — the grid operator for Virginia, Ohio, and 11 other states — received 166 GW of forecast peak load growth, roughly 90 GW from data centers. Industry analysts say actual need is closer to 65 GW. Forecasts are overstating demand by ~38%. The "ask-versus-accepted" gap is 43% before forecasts are even published. By summer 2026, PJM will have "just enough power to keep the grid reliable" — partly because they're planning for phantom load that may never materialize.
Sources: Modo Energy (PJM forecast), Utility Dive, NRDC, WRI
If we just tell people to "use a smaller chip," we ignore the fact that speed saves lives — especially in clinical research. The H200 got us a mortality prediction in 42 minutes. On a consumer GPU, that's 80 minutes. In a hospital, that time difference matters.
The goal isn't to restrict compute. It's to measure the efficiency so we can pack more science into the same grid — rather than building new gas plants for "peaks" that rarely happen. We're too smart to waste the energy we're fighting for in the Strait of Hormuz.
Prototyping: Exploring new architectures. This is fine — it's discovery.
Scaling: Full datasets, larger batches. Utilization climbs naturally.
Production sweeps: Full hyperparameter sweeps, MIMIC-IV. The H200 earns its keep.
Some fleets ARE 100% utilized — they are the rationale for the Stargate project in Argentina. But most enterprise AI doesn't need 100% of an H200's TDP most of the time, even if there are moments when it does. The utilities treat the 93W research run and OpenAI's 700W training run as the same 700W ghost target on the grid.
The H200 scales from 93W to 700W dynamically. That's precision engineering. If we focus on infrastructure awareness — showing that chips can scale down — we prove that solar interconnectivity is possible. If chips can match the solar cycle, the grid can too.
Nameplate capacity. 10,000 H200s = 7 MW reserved. They build gas plants for the peak. Raise residential rates to fund it. The "ghost target" becomes a 30-year fossil fuel commitment.
Actual load: 0.93 MW. Peak surges happen but they're brief and predictable. We don't need a gas plant for the surge — we need batteries and solar interconnects to buffer it. Move the training surge to daytime solar. Run inference work in the background.
Data centers are lying by omission. They know that while some chips are surging, 80% of the fleet could be idling. They're hoarding grid capacity like a landlord hoards empty apartments — keeping the prices high and carbon-heavy while the rest of us just want to get the work done.
Data centers should be legally required to report actual peak vs provisioned load. In March 2026, the Strait of Hormuz blockade makes every watt a strategic asset. We can't let ghost targets hoard energy that hospitals and homes need.
Residential bills should not subsidize ghost infrastructure. If a data center's provisioned load is 10x higher than their actual usage, they should pay a surcharge. That money funds the solar interconnects and batteries needed to bridge the energy gap — so the local community isn't paying for industry inefficiency.
Sources: Utility Dive (off-grid risks), ENR (large load rules), S&P Global (22% demand rise)
DeepSeek isn't just "Chinese ChatGPT." It's a masterclass in algorithmic efficiency. Their Mixture of Experts architecture only wakes up about 5% of its parameters for any given query. It does world-class science while drawing a fraction of the power of a brute-force model like GPT-4. They don't need a 24/7 gas baseline because their models are smart enough to scale with the solar cycle.
Meta is building 7 new gas plants in Louisiana just to keep its ghost targets from crashing the local grid. Brute force. Fossil lock-in. Residential bills up $16/month in Virginia.
Under the 15th Five-Year Plan, China moved its intelligent computing hubs directly next to the largest wind and solar farms in Inner Mongolia and Gansu. 15+ UHV lines pipe green energy directly into AI. By 2026, China's new computing hubs target 80% green electricity. They didn't just build data centers — they redrew the map.
Reclaiming the ghost gap is how we fund the next generation of American solar and nuclear without slowing down the science. Transparency forces the hyperscalers — Google, Meta, Microsoft — to admit they don't need the Louisiana gas plants if they just optimized their fleet. If we had transparency, we'd see we don't need a new gas plant for the surge. We need batteries and solar interconnects to buffer it. We need the precision to move the training surge to daytime and the inference work to the background. We have the capacity to do hard things — but we're too smart to let them use the surge as a shield for fossil fuel lock-in.
AI models have Model Cards. Compute should have Climate Compute Cards.
Every transformative technology eventually gets a disclosure standard. AI compute is overdue.
Cars got fuel economy labels. Manufacturers had to disclose MPG. Efficiency went from 8 to 30 MPG.
Appliances got efficiency ratings. Consumers could compare. Market rewarded efficiency.
Mitchell et al. proposed standardized documentation for AI models — bias, training data, intended use, limitations.
Every AI workload discloses its energy footprint, ghost ratio, and grid impact. The missing standard.
Source: Mitchell et al. "Model Cards for Model Reporting." FAT* 2019
| AI Model Card discloses | Climate Compute Card discloses |
|---|---|
| What the model was trained on | What hardware it ran on |
| Known biases and limitations | GPU utilization and memory usage |
| Intended use cases | Energy consumed (kWh) |
| Performance metrics | CO2 emitted (kg) |
| Who built it and when | Grid carbon intensity at time of run |
| Ethical considerations | Ghost ratio: TDP vs actual power draw |
Ghost Ratio = TDP / actual power draw. Higher = more infrastructure built for capacity not used.
Measured with codecarbon + pynvml | Primary data, not estimates
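A sketch of what emitting a Climate Compute Card could look like. The field names are this report's proposal, not an existing standard; energy and CO2 would come from the codecarbon run shown earlier, and the TDP constant is the H200's.

```python
import json
import pynvml

TDP_W = 700  # H200 nameplate; swap in your chip's rating

def climate_compute_card(energy_kwh, co2_kg, grid_gco2_kwh):
    """Build a minimal CCC for the current GPU (proposed schema, not a standard)."""
    pynvml.nvmlInit()
    h = pynvml.nvmlDeviceGetHandleByIndex(0)
    name = pynvml.nvmlDeviceGetName(h)
    if isinstance(name, bytes):                        # older pynvml returns bytes
        name = name.decode()
    draw_w = pynvml.nvmlDeviceGetPowerUsage(h) / 1000  # instantaneous draw, mW -> W
    pynvml.nvmlShutdown()
    return json.dumps({
        "hardware": name,
        "energy_kwh": energy_kwh,
        "co2_kg": co2_kg,
        "grid_intensity_gco2_kwh": grid_gco2_kwh,
        "ghost_ratio": round(TDP_W / draw_w, 1),       # TDP / actual power draw
    }, indent=2)
```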
"Right-sized?" is the wrong question. The H200 scaling to 93W IS right-sizing — the chip does it automatically. The problem isn't the hardware choice. It's the infrastructure built around the nameplate TDP that the chip rarely hits. Ghost Ratio exposes that gap. 7.5x means the grid provisioned 7.5x more capacity than was actually used.
This is already a known problem at the grid level. PJM received 60 GW of data center capacity requests but accepted only 34 GW — a 43% cut for "phantom load." Dominion Virginia reports an 82% load factor for data centers, but some submissions assume 100%. The Ghost Ratio brings this visibility down to the chip level, where the gap is even larger.
Sources: Utility Dive (PJM), WRI, Grid Strategies
codecarbon and pynvml are open source, free, and work on any GPU. We built this into our research pipeline in a day. The barrier isn't technical — nobody requires it.
A researcher at 11% during testing and a hyperscaler permanently at 11% across 10,000 GPUs are very different stories. The CCC provides context — phase of work, time of run, grid source. It rewards efficiency without punishing exploration.
Fleet-wide Ghost Ratios become visible and actionable. If Google's average Ghost Ratio is 5x across 100,000 GPUs, that's the data a utility planner needs to stop building gas plants for peak loads that never happen.
CAFE standards for cars (1975). Energy Star for appliances (1992). LEED for buildings (1998). Model Cards for AI models (2019). Climate Compute Cards for AI infrastructure (2026).
The demand is outrunning clean energy supply. The gap is being filled with natural gas, and Big Tech is moving to control energy production itself.
Natural gas: Largest single source powering US data centers in 2024. Growing as AI demand outpaces renewable buildout.
Coal: Still a major source globally. US grid mix includes coal in the PJM region (Virginia, Ohio) where data centers concentrate.
Renewables + nuclear: Solar, wind, hydro, existing nuclear. Growing but not fast enough to match AI demand growth of 12%/year.
Despite Net Zero pledges, the AI boom is locking in decades of new fossil fuel infrastructure. Demand for gas turbines has pushed delivery timelines out to 2028+.
Meta is paying for 7 new natural gas plants (5.2 GW total) to power its Louisiana data center — the largest single data center facility planned in the US.
Across PJM: Almost all new gas-fired plants secured in fast-track procurement won't come online until 2030 or later. Once built, they operate for 30-40 years — locking in fossil fuel dependency through the 2060s.
Data center developers have asked the Trump administration for exemptions from pollution rules for new gas plants, arguing AI demand is a national security priority.
The renewable rollback compounds the problem.
US solar installations fell 14% in 2025 after Trump rollbacks hit the sector
Federal solar tax credit (25D) eliminated December 31, 2025 via the "One Big Beautiful Bill"
Interior Department paused offshore wind projects under construction
New "project density" rules disqualify many solar/wind projects from federal land permits
US pulled from international renewable energy and climate organizations (Jan 2026)
AI demand is growing at 12%/year. Renewable capacity is contracting. The gap is filled with gas.
Sources: Insurance Journal (Meta gas plants), Grist (pollution exemptions), Semafor (solar -14%), Third Way (rollback timeline), SEIA (OBBB)
Unable to secure enough clean energy from the grid, tech companies are moving to control energy production directly — primarily through nuclear.
| Company | Nuclear deal | Scale | Timeline |
|---|---|---|---|
| Microsoft | Restart Three Mile Island with Constellation Energy | 835 MW, $16B (20-year PPA) | Target 2028 |
| Google | Backing Kairos Power SMRs | 500 MW development agreement | Hermes 2: 2030 |
| Amazon | $500M investment in X-energy SMRs + Susquehanna campus | $20B+ total AI campus | 2030+ |
| Meta | RFP for new nuclear generation | 1-4 GW requested | TBD |
Combined: Big Tech has signed 10+ GW of new US nuclear in the past year. SMRs remain 5+ years from commercial operation.
Sources: IEEE Spectrum, Introl, Marketplace, IAEA
4,088 facilities across 26 tracked states. Solid = existing, dashed = planned. Data from Visual Capitalist, Axios, DataCenterMap (2025-2026).
Sources: Visual Capitalist, DataCenterMap.com (4,088 facilities), Axios
AI compute policy is energy policy is foreign policy. The demand signal from data centers flows through a chain that ends at shipping lanes and military operations.
"Drill Baby Drill" isn't just a slogan — it's sustained by two converging demand signals.
New terminals: Corpus Christi expansion (Mar 2025), Plaquemines (Dec 2024), Golden Pass (early 2026).
Data center demand: Growing 12%/year. Fastest new source of US electricity demand. Natural gas fills 40%+ of data center power.
LNG exports: Growing 15-19%/year. US is the world's largest exporter. The Hormuz crisis accelerates demand from Japan, Korea, India.
Both create political support for continued drilling. Both require investment in gas infrastructure. Both lock in fossil dependence for decades.
The connection to AI policy: If AI data centers shift to renewables, the domestic demand signal for gas weakens. This doesn't eliminate LNG exports, but it removes the fastest-growing justification for new gas infrastructure. Policy that makes AI compute more efficient is policy that reduces the demand signal for fossil expansion.
Sources: EIA, Natural Gas Intel, Wolf Street
US and Israel conducted strikes on Iran (Feb 28, 2026). Iran closed the Strait of Hormuz to commercial shipping.
Tankers rerouting via Cape of Good Hope: +10-14 days per voyage. War-risk insurance: +$250K per transit.
Sources: Wikipedia, Al Jazeera, CNBC, Bloomberg
Japan imports 85%+ of all energy. The Hormuz closure directly threatens national survival.
Japan is pushed toward US LNG — which is available, at US-set prices. Energy vulnerability is geopolitical leverage.
Sources: S&P Global, Energy Tracker Asia, CSIS
With the Strait of Hormuz closed and Houthi attacks resuming in the Red Sea (announced Feb 28), tankers must reroute around Africa:
Normal route: Persian Gulf → Strait of Hormuz → Indian Ocean → destination
Reroute: Persian Gulf → around Africa → Cape of Good Hope → Atlantic → destination
Cost: +10-14 extra days per voyage
Longer route = more fuel burned per tanker, slower delivery cadence, fewer trips per year per ship. The same fleet delivers less oil at higher cost — and burns more fossil fuel doing it. The reroute itself increases global emissions while reducing supply.
Sources: Middle East Insider, Maritime News
Do nothing: Energy costs passed to consumers and industry. Economic contraction.
US LNG: Available immediately. At US-set prices. Deepens energy dependence on the US.
Nuclear restarts: Politically difficult post-Fukushima. Years to restart safely.
Renewables: Right answer long-term. Not fast enough for the immediate crisis.
The geopolitical leverage: A country that imports 95% of its oil from a region controlled by US military power is a country that follows US policy preferences. Japan's energy vulnerability is, from a realpolitik perspective, a feature not a bug of the current system. The Hormuz closure pushes Japan (and South Korea, Taiwan, India) toward US LNG exports — which are growing at record pace. The military action that disrupted traditional supply routes simultaneously creates demand for American gas.
AI compute efficiency alone doesn't end fossil fuel dependence. LNG exports, transportation, heating, and industrial demand continue regardless. But AI is the marginal demand driver — the fastest-growing new source of electricity consumption. Removing it from the fossil equation changes the marginal economics of new gas plants.
A gas plant that's justified by AI demand + LNG exports + residential growth might not be justified by LNG exports + residential growth alone. The AI demand is the tipping point in many capacity decisions. Policy that addresses it is policy that shifts those decisions.
The chain starts with demand. If AI data centers operate more efficiently, the demand signal for new gas infrastructure weakens:
"AI is a tool worth powering. But powering it with fossil fuels locks in 30-40 years of carbon infrastructure, requires military control of energy supply routes, and concentrates energy production in the hands of the same companies that control the AI. The alternative — renewables, right-sized compute, temporal flexibility — breaks the cycle. It's not anti-AI. It's anti-lock-in."
Not a threat narrative — a competitive reality check.
China isn't just outpacing the U.S. — they're playing a different game. While the U.S. spent 2025-2026 navigating policy rollbacks, China leveraged a state-led "Dual-Track" strategy and hit their climate targets six years ahead of schedule.
Wind and solar: Surpassed their 2030 target of 1,200 GW in mid-2024 — six years early
US expected: ~44 GW in 2026. China deploys 7-8x more solar annually
More than half of global total. US: ~150 GW
While the US worries about data centers in Virginia raising local bills, China is executing a national strategy to solve the exact same problem.
Move energy-hungry AI clusters from power-constrained coastal cities (Shanghai, Shenzhen) to resource-rich western regions (Inner Mongolia, Gansu, Ningxia).
Projected 400 GW of spare renewable capacity in the west by 2030. Data centers built directly next to solar/wind farms — the "Solar Interconnects" this report advocates for.
15 new Ultra-High Voltage (UHV) transmission lines between 2026-2030. One Tibet-to-south project delivers 43 billion kWh/year of green power to megacities.
China is not purely green. They're running a "Dual-Track" system: a projected 1,333 GW of coal by end of 2026. But unlike the US, where coal often serves as the primary fuel, China increasingly uses coal as a strategic reserve to balance the intermittency of its massive wind and solar fleets. Coal as backup, not as base load.
| Factor | United States (2026) | China (2026) |
|---|---|---|
| Policy | Uncertainty: OBBB rollbacks, solar tax credits eliminated | Centrally mandated 15th Five-Year Plan, $580B grid investment |
| Manufacturing | Reshoring slow, struggling to build domestic solar cells | Absolute dominance, price war crashing global panel costs |
| Permitting | 3-5 year interconnection queues for new projects | Fast-tracked UHV "Green Lanes" for state-linked projects |
| Solar deployed (2025) | ~44 GW | 317 GW (7-8x more) |
| AI compute strategy | Data centers wherever land is cheap, powered by local grid (gas) | National plan: move compute to renewable surplus regions |
China's progress isn't a threat narrative — it's a competitive benchmark. They figured out that to win at AI, you have to win at green energy interconnectivity first.
If the US doesn't adopt mandatory transparency and right-sized compute, we will lose the AI race not because our models are worse — but because our grid is too expensive and inefficient to power them competitively.
China solved the "Resource Squeeze" by geographically decoupling AI compute from coastal population hubs. The US can do the same — but it requires policy that the current administration is actively dismantling.
If the Strait of Hormuz is the energy chokepoint, Taiwan is the compute chokepoint. In 2026, these are no longer separate issues — they are the same single point of failure.
~95% of Japan's oil imports pass through here. Closed by Iran, Feb 28, 2026.
Controls: oil, LNG, helium (critical for chip manufacturing)
Impact: energy prices, shipping costs, grid stability across Asia
Duration: indefinite — Iran controls the coastline
~100% of high-end AI chips (2nm-5nm) are manufactured by TSMC in Taiwan.
Controls: every H200, B200, and next-gen AI accelerator
Impact: if TSMC stops, the global AI supply chain ceases to exist
2nm capacity: 100% pre-booked through 2028 (Apple, NVIDIA, OpenAI)
The most immediate Taiwan threat in March 2026 isn't a missile — it's a blackout.
11 days of LNG reserves remaining in Taiwan as of March 2026
A large share of Taiwan's gas comes from Qatar/UAE — cut off by the Hormuz closure
0 nuclear units operating — the last shut down mid-2025
100% of 2nm chip production capacity at risk
"We are betting the entire AI revolution on an island that is 11 days away from energy collapse if the Middle East stays closed. This isn't just a military risk — it's an infrastructure oversight."
Sources: CommonWealth Magazine (Taiwan gas crisis), Atlantic Council (energy resilience), LiveUAMap/Politico (11-day figure), Domino Theory (energy dependence)
The energy blackout scenario is the immediate risk. But the deeper threat is China's long-standing territorial claim over Taiwan — and 2026 conditions that make action more likely than any time since 1996.
China has conducted record military exercises around Taiwan in 2025-2026. PLA Navy now has more warships than the US Navy. Amphibious landing capability has tripled since 2020. The 2026 National Defense Strategy pivoted toward "denial-based defense" of the First Island Chain — signaling the US is preparing for a world where Taiwan can't be defended conventionally.
The US/Israel strikes on Iran and Hormuz closure have shown China what a chokepoint disruption looks like — and that the global response is economic sanctions, not military intervention. If the US won't risk war over Hormuz oil, will it risk war over Taiwan chips? Beijing is watching the answer in real time.
Sources: The War Zone (blockade drills), Defense News, The Diplomat, Global Taiwan Institute (drone warfare), Brookings (gray zone), AEI (March 2026 update)
Taiwan's chip dominance was their "Silicon Shield" — China wouldn't invade because they needed the chips too. In 2026, that shield is thinning from both sides.
US reshoring push: Jan 2026 MOU pushes Taiwan firms to move 40% of supply chain to the States by 2029. CHIPS Act funding accelerates domestic fab construction.
The signal to Beijing: If the US is building its own fabs, it's preparing for a world where Taiwan is no longer the sole source. That makes Taiwan less valuable to protect.
China's domestic chip push: SMIC producing 7nm chips (sanctions workaround). Huawei's Ascend AI accelerators reducing dependence on TSMC for some workloads.
The closing window: If both the US and China reduce TSMC dependence, Taiwan loses its shield. China may feel pressured to act before 2028 — while the world is still too dependent to risk a full military response.
March 2026: intensified drone incursions and transponder-spoofing. Not provocations — rehearsals for a selective blockade that could filter energy shipments while letting other trade pass. Economic coercion driving up insurance rates and shipping costs for Taiwan.
China doesn't need to invade. A naval blockade cutting LNG shipments to Taiwan for 11 days collapses their grid. TSMC goes offline. The global AI supply chain stops. China wins without firing a shot at the fabs themselves.
Even without a blockade, constant gray zone activity raises shipping insurance premiums for Taiwan-bound vessels. This makes Taiwan's chip exports less competitive vs reshored alternatives — achieving economic coercion through risk pricing alone.
Sources: CNBC (US-Taiwan deal), Stimson Center (Silicon Shield erosion), CSIS (Taiwan importance), Wisconsin SoB (CHIPS Act cost)
The Hormuz closure doesn't just affect oil. It's halted helium shipments from Qatar — one of the world's largest producers. Helium is essential for cooling the lasers that etch AI chips.
Samsung and SK Hynix in South Korea are already rationing helium.
This has triggered a "Memory Supercycle" — High-Bandwidth Memory (HBM), the brain of the H200, is becoming physically scarce.
The HBM3e in the H200 we tested (150 GB) requires helium-cooled manufacturing. No helium → no HBM → no H200s → no frontier AI training.
The 2nm bottleneck compounds the problem.
TSMC's 2nm chips (mass production late 2026) promise 25-30% power reduction. But 100% of capacity is pre-booked through 2028. Researchers are stuck on older, less efficient nodes — throwing more "brute force" power (natural gas) at older hardware.
This is what's driving the carbon backsliding. We can't get efficient chips, so we burn more gas to compensate with less efficient hardware.
| Threat | Innovation solution | Climate solution |
|---|---|---|
| Grid fragility (Taiwan 11-day reserves) | Right-sized compute: reduce base load so TSMC can stay online longer on limited reserves | Solar interconnectivity: push for off-grid AI clusters not dependent on imported LNG |
| Supply chain concentration (100% TSMC) | Geographic decentralization: distributed clusters reduce need for single-source 2nm chips | Hardware longevity: incentivize legacy chips (8nm-12nm) manufacturable in US/Europe today |
| Blockade risk (gray zone escalation) | Transparency mandates: force cloud providers to disclose where physical compute is located | Circular economy: tax credits for refurbishing AI hardware, reducing demand from conflict zones |
| Helium shortage (chip manufacturing) | Domestic helium: invest in US helium extraction (BLM reserves in Texas, Kansas) | Chip efficiency: right-sized models don't need frontier chips — our 1.7 GB model proves it |
"We are currently using the world's most vulnerable hardware (H200s from Taiwan) powered by the world's most volatile energy (natural gas from Hormuz)."
"To protect the climate and the country, we need distributed, transparent, and domestic compute. Our data proves we can do world-class clinical AI on right-sized, American-made hardware today."
"The 'Taiwan Threat' is a mirror of the 'Hormuz Crisis.' Both are caused by centralization. We have centralized our energy in the Middle East and our compute in the Taiwan Strait. To protect American science and the climate, we must decentralize."
Nations are racing to build self-sustaining AI infrastructure outside conflict zones. The winner doesn't just own AI — they own foresight.
In October 2025, OpenAI and Sur Energy announced a $25 billion data center project in Patagonia — one of the largest AI infrastructure investments outside the US.
Neuquén province, Patagonia. Geographically isolated from Middle East and Taiwan Strait conflict zones. Naturally cold climate reduces cooling costs. First phase: 500 MW capacity.
Sits on Vaca Muerta shale gas deposits and Limay River hydroelectric dams. Energy partnerships with Central Puerto and Genneia (renewable provider). Uses closed-circuit cooling — no river or sea water needed.
President Milei's RIGI (Incentive Regime for Large Investments) created the regulatory framework. The Hormuz crisis and Taiwan vulnerability make a Southern Hemisphere "compute island" strategically critical.
Timeline: First phase ($7-10B) — construction starts 2026, operational late 2027. Full buildout: $25B.
Sources: Infobae (OpenAI official), Data Center Dynamics, Bloomberg Línea, Zarego
The Stargate Argentina project doesn't exist in a vacuum. South America's political landscape has shifted dramatically in 2025-2026, directly affecting the feasibility of large-scale Western AI infrastructure in the region.
Under Milei, Argentina has realigned sharply toward the US. The RIGI framework was designed specifically to attract projects like Stargate. Argentina has historically had territorial and political tensions with Venezuela — the two represent opposite poles of South American politics.
The January 2026 US intervention in Venezuela (Operation Absolute Resolve) removed a China-aligned government from the northern coast of South America. Venezuelan airspace had been restricted to US flights. The intervention opened the region's logistics and air corridors — the same corridors that connect Miami to Patagonia.
The strategic implication: the Western Hemisphere now has a clear north-south corridor from the US through a stabilized Caribbean to a US-aligned Argentina — with a $25B AI data center at the southern end and local energy independence from Vaca Muerta.
2025-2026 has seen the largest wave of subsea cable construction in history. These cables determine where data — and therefore AI — can physically operate.
| Cable | Route | Builder | Strategic significance |
|---|---|---|---|
| Humboldt | Chile → French Polynesia → Australia (14,800 km) | Google + Chile (50/50 JV) | First-ever direct South America → Asia-Pacific link. 144 Tbps. |
| Project Waterworth | US → South Africa → India → Asia-Pacific → Americas | Meta | Multi-ocean system creating redundant global backbone |
| IOEMA | UK → Netherlands → Germany → Denmark → Norway | Consortium | Northern Europe compute corridor |
| Fiber-in-Gulf (FIG) | Gulf region interconnect | Regional | Gulf AI sovereignty — connects Saudi Hexagon to UAE compute |
The theme: "connectivity with intent." These cables aren't about bandwidth — they're about strategic redundancy. Each one reduces dependence on a single chokepoint.
Sources: Interglobix Magazine, Google Cloud Blog (Humboldt), US State Department
Every major power is building "sovereign compute" — self-sustaining AI infrastructure insulated from adversaries.
| Player | Strategy | Key move |
|---|---|---|
| United States | Western Hemisphere corridor | Stargate Argentina ($25B), CHIPS Act reshoring ($280B+), domestic fabs in Arizona |
| China | "Eastern Data, Western Computing" + UHV grid | 1,400 GW renewables, 15 new UHV lines, SMIC 7nm workaround |
| Saudi Arabia | Sovereign AI spend | 480 MW Hexagon data center in Riyadh — largest government-owned in the world |
| UAE | "Neutral compute hub" | OpenAI/Microsoft partnership, but under fire from Iran — neutrality tested |
| Russia | Energy spoiler | Profiting from $100+ oil while funding battlefield AI from stripped open-weight models |
| Israel | AI proving ground | Testing autonomous systems in Operation Epic Fury — real-world AI-first warfare |
The game ends when one side achieves compute sovereignty — the ability to run their society, military, and economy on infrastructure that cannot be shut down by an adversary.
The nation with the most efficient compute can run models that predict market shifts, disease outbreaks, and supply chain disruptions months ahead. Foresight becomes the new currency of power.
By using solar interconnects and right-sized compute, the winner decouples AI growth from residential grid strain. No voter backlash from rising bills. Innovation without the political cost.
A nation that loses this race becomes a "compute colony" — forced to rent intelligence from foreign infrastructure. Their economic strategy, military planning, and scientific output become dependent on someone else's grid.
The 87% compute waste we measured on the H200 isn't just an efficiency problem. In a world where compute is the strategic resource, waste is a vulnerability.
The current state: We are running the world's most important AI workloads on the world's most vulnerable hardware (Taiwan), powered by the world's most volatile energy (Middle Eastern gas), at 11% efficiency.
The alternative: Right-sized, domestically manufactured hardware. Solar-interconnected, behind-the-meter power. Transparent utilization reporting. Our H200 data shows the same science fits on a $300 GPU. We don't need to wait for the next generation of chips from a conflict zone.
"Optimization isn't just about the climate — it's about making sure American science stays online when the rest of the world goes dark."
Decoupling AI from the residential grid. Not all AI work is urgent.
Training: CAN shift to solar peak or off-peak hours. Our 3-hour run doesn't care if it starts at 2 PM or 2 AM.
Inference: CANNOT shift. Hospital mortality predictions need answers now. Patients can't wait for solar peak.
Same scientific result. 27x less energy. Speed and efficiency aren't always trade-offs — sometimes better engineering gives you both.
Data center "flexibility" could save the US electricity system $40–150 billion over the next decade.
By avoiding new gas plants, reducing fuel costs, and shifting capacity toward renewables. Ross & Ewing, Duke Nicholas Institute, Feb 2026
Training can run anytime — but the grid carbon cost depends on when. The incentive question: can rate structures make daytime solar cheaper than overnight gas?
Training can run at 2 AM just as well as 2 PM. Many researchers run overnight because the cluster is less contested. There's no technical reason to prefer daytime. The question is purely economic: does the rate structure make solar hours cheaper?
If utilities offered "green compute rates" — lower $/kWh during solar peak, higher during gas-heavy overnight — researchers and hyperscalers would naturally shift training to daytime. Not because they have to. Because it's cheaper. The grid gets cleaner without a mandate.
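A sketch of what carbon- or price-aware scheduling could look like for a shiftable training job. `grid_carbon_intensity` is a hypothetical hook; in practice it would query the utility's rate API or a carbon-intensity service, neither of which is assumed here.

```python
import time

def grid_carbon_intensity() -> float:
    """Hypothetical hook. A real version would query the local utility or a
    carbon-intensity service; hard-coded here so the sketch runs."""
    return 150.0  # gCO2/kWh placeholder

def run_when_clean(train_fn, threshold=200.0, poll_minutes=30):
    """Delay a shiftable job until the grid is below a carbon threshold.
    The threshold and polling interval are illustrative."""
    while grid_carbon_intensity() > threshold:
        time.sleep(poll_minutes * 60)   # a 3-hour run doesn't care when it starts
    train_fn()
```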
Instead of building gas plants for ghost peaks, what if data centers funded neighborhood batteries and solar that buffer the surge? That's a Virtual Power Plant — and data centers are the first customers willing to pay for it at scale.
Devices in homes — rooftop solar, batteries, EVs, smart thermostats, heat pumps — can act as mini-power plants or adjust energy usage in real time. Link thousands of them together to respond to grid spikes, and you have a Virtual Power Plant. No new land. No new gas turbines. Just smarter use of what's already installed.
— Mark Dyson, Managing Director, RMI (Rocky Mountain Institute)
The crisis is data center demand outpacing the grid. The cash is hyperscalers willing to pay for power "as quickly as possible" — and impatient enough to circumvent the 3-5 year grid interconnection queue. Data centers may be the first customers to subsidize VPP programs at scale, funding residential batteries and solar in exchange for grid capacity that comes online in months, not years.
The gas-plant path: Data center requests 7 MW of capacity.
Utility builds a gas plant for the peak.
Actual load: 0.93 MW. Ghost: 6.1 MW.
Gas plant operates for 30-40 years.
Residential bills go up to fund it.
Cost: $billions in stranded fossil assets.
The VPP path: Data center funds residential solar + batteries in its service territory.
1,000 homes with batteries = distributed surge capacity.
When the data center hits peak, homes flex load down.
When the data center idles, homes charge from solar.
No gas plant needed. Residential bills go down.
Cost: fraction of a gas plant. Online in months.
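A back-of-envelope check on the "1,000 homes with batteries" claim above, under assumed per-home numbers (13.5 kWh / 5 kW, typical of one residential battery unit). The 6.1 MW surge is the ghost gap from the earlier grid example.

```python
homes = 1_000
battery_kw = 5.0      # per-home discharge rate (assumption)
battery_kwh = 13.5    # per-home storage (assumption)

surge_mw = 6.1                                    # ghost gap from the grid example
fleet_mw = homes * battery_kw / 1000              # 5.0 MW of flexible capacity
buffer_h = homes * battery_kwh / 1000 / surge_mw  # ~2.2 h of storage vs the surge

# The fleet covers most of the surge for roughly two hours; paired with
# load flexing (the Dyson quote below), no gas peaker is needed.
print(f"{fleet_mw:.1f} MW flexible | ~{buffer_h:.1f} h of buffer against {surge_mw} MW")
```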
"Does a data center company ever want to say 'I won't run my training model for a couple hours on the hottest day of the year'? No. Instead, the opportunity is to pay other people to flex their load, or pay other people to adopt batteries that create headroom on the system."
— Mark Dyson, RMI
Data centers provide capital to help customers in their service territory buy residential batteries — or contracts that guarantee a return for VPP aggregators. Hyperscalers get capacity unlocked quickly. Homeowners get subsidized batteries. The grid gets flexible.
~70% of US electricity customers are served by investor-owned utilities that earn profits by building infrastructure, not by reducing demand. Utilities lack natural incentive to support VPPs that shrink their capital expenditure pipeline. Regulation needs to catch up.
EnergyHub's VPP Maturity Model defines five levels. Most advanced VPPs today are at Level 2. The goal is Level 4-5: indistinguishable from conventional power plants.
VPP benefits accrue to households that can afford solar + batteries. The cost of maintaining shared grid infrastructure falls on everyone. This echoes the rooftop solar equity debate. But unlike rooftop solar, VPPs also benefit non-participants — if the programs succeed in driving down system costs and improving reliability, everyone's bills decrease. The question is whether the benefits are distributed fairly, not whether they exist.
Some researchers argue grid-scale batteries at the utility level achieve the same goals at lower unit cost, with benefits shared equally across all customers. VPP advocates counter that with 3-5 year interconnection queues, distributed resources deploy faster. The answer may be both — grid-scale storage for baseload, VPPs for rapid-response surge buffering. Neither requires a new gas plant.
Sources: Heatmap News: "Will VPPs Ever Really Be a Thing?" (Feb 2026) — Mark Dyson (RMI), Apoorv Bhargava (WeaveGrid), Matthew Plante (Voltus), Ryan Hanna (UC San Diego), Ben Hertz-Shargel (WoodMackenzie). VPP Maturity Model via EnergyHub. DOE 80-160 GW target via Jigar Shah (former DOE Loan Programs Office).
The EU's 2026 Data Centre Energy Efficiency Package already requires disclosure. The US should match or exceed it.
Per-job utilization, power draw, memory — reported quarterly. EPA requires emissions reporting. AI compute should too.
Impact fees. Behind-the-meter power. Demand response. Residential bills should NOT subsidize corporate AI training.
Tax credits for matching hardware to workload. Our 1.7 GB model on a 150 GB chip is a school bus for one person.
Federal funding for flexible scheduling. Solar-peak priority access. Datacenters that store energy for the grid get credits.
Research exempt from utilization mandates. Reporting yes, penalties no. A cancer researcher at 2% is still doing important work.
Require standardized disclosure — GPU utilization, Ghost Ratio, energy, CO₂, grid carbon intensity — for every training run above a threshold. Cars got EPA stickers. AI compute needs the same. → Section 05
Allow datacenters to meet behind-the-meter requirements by funding neighborhood VPPs — residential solar + batteries that buffer grid demand. DOE target: 80–160 GW by 2030. → Section 10
Mandate WUE disclosure. Cap peak withdrawal during droughts. Require liquid/dry cooling for new builds in water-stressed regions. 720 billion gallons/year by 2028 can't be invisible.
| Technology | Policy Lever | Impact |
|---|---|---|
| Liquid / Immersion Cooling | Mandate for new facilities in water-stressed regions | 70% water reduction vs evaporative |
| Closed-Loop / Dry Cooling | "Water-positive" mandates for facilities in drought-prone regions (AZ, TX, IA) | Near-zero direct water consumption |
| Virtual Power Plants | Datacenter impact fees fund neighborhood solar + batteries. Grid capacity in months, not the 3-5yr interconnection queue | 80–160 GW by 2030 (DOE target) |
| Measurement Tooling (codecarbon, pynvml) | Open-source, free, works on any GPU today. The barrier is cultural, not technical — standardize and require | Enables CCC / Ghost Ratio disclosure |
| Efficient Architectures (MoE, Distillation) | R&D tax credits for compute-efficient model design. DeepSeek matched frontier performance at a fraction of the cost | Orders-of-magnitude compute reduction |
| UHV Transmission | Ultra-high voltage lines to move compute to where clean energy is. China's "Eastern Data, Western Computing" runs on this | Decouples location from load |
| Right-Sized Chip Allocation | Tax incentives for high fleet utilization | Direct idle energy reduction |
| Temporal Flexibility | Rate structures rewarding off-peak training | $40–150B grid savings / decade |
| On-Site SMRs (Small Modular Reactors) | Streamlined permitting for co-location | Decouples AI from public grid |
| Hardware Circularity | Extended producer responsibility for AI chips | Addresses 2-3yr chip e-waste |
"AI uses too much energy. Restrict it."
"AI pays for grid modernization — IF policy ensures it."
"We don't need less compute for science. We need to know what compute costs and give researchers the tools to make informed choices. Right now nobody measures, nobody reports, nobody knows. That's the policy gap."
The AI boom is being used to justify new gas plants, new LNG terminals, and fossil infrastructure locked in for 30–40 years. That infrastructure doesn't just emit carbon — it requires imperial maintenance. Controlling the Strait of Hormuz. Locking allies like Japan into US gas exports. Export controls to contain China. American families pay for all of it through higher energy bills and supply chain fragility.
China figured out the alternative: domestic renewables, UHV transmission, domestic chip fabs. They don't need Hormuz. They don't need LNG imports long-term. They're building self-sufficiency, not dependency.
The US has the same option. VPPs deploy in months, not years. Right-sizing cuts demand immediately. Temporal flexibility uses existing solar at zero new cost. Domestic fabs reduce the Taiwan chokepoint. Self-sufficiency turns energy and chips from necessities that must be controlled through force into commodities that can be traded freely. That eases imperial pressure abroad and lowers costs for citizens at home.
Any administration is temporary — and this one will likely face deadlock once the House and Senate flip in the midterms. There will always be resistance to progress at the federal level. But that's not where the fastest action happens.
Communities are swifter. Cities and states are already going green because it makes financial sense — not because Washington told them to. Local politicians can approve VPP programs, pass behind-the-meter solar requirements, mandate WUE disclosure for new datacenter permits, and set green compute rate structures. None of that requires Congress. A county commissioner can move faster than a Senate subcommittee.
The speed mismatch works both ways. AI demand is growing faster than federal policy can respond — but it's also growing faster than fossil infrastructure can be built. Communities that act now on VPPs, solar interconnects, and ratepayer protections don't have to wait for the gas plant. They can be the grid the datacenter plugs into — on their terms, at their price.
Don't lock the AI boom into fossil infrastructure that requires imperial maintenance for 40 years. Build the domestic energy and domestic chips that make all of that unnecessary. China is already doing both. American communities are starting to. Federal policy should catch up.
"The climate movement needs to continue to elect climate champions to local offices across the country. More than that, we have to support those leaders once they are elected so that they can be successful in passing aggressive climate policy. We are fortunate to have Run On Climate leading the way in that effort."
— Bill McKibben
Author, Environmentalist, Journalist, and Co-founder of 350.org
| Run | Time | Energy | CO2 | GPU % | Memory |
|---|---|---|---|---|---|
| LMM standalone | 104 min | 2.68 kWh | 0.99 kg | 11% | 1.72 / 150 GB |
| GRASP+LMM | 103 min | 2.66 kWh | 0.98 kg | 11% | 1.73 / 150 GB |
| GRASP+LMM+codemap (lr=5e-4) | 177 min | 4.60 kWh | 1.69 kg | 14% | 2.58 / 150 GB |
| GRASP+LMM+codemap (lr=1e-4) | 181 min | 4.67 kWh | 1.72 kg | 12% | 2.58 / 150 GB |
| GRASP+LMM+codemap (3 fixes) | 161 min | 0.61 kWh | 0.22 kg | 14% | 1.67 / 150 GB |
| LMM standalone + batch=256 | 42 min | 0.17 kWh | 0.06 kg | 16% | 3.39 / 150 GB |
Methodology: codecarbon (energy/CO2 from actual sensor readings) + pynvml (GPU metrics from NVIDIA driver) + torch.cuda (memory tracking)
Hardware: NVIDIA H200 NVL, 150 GB HBM3e, CUDA 12.6
Dataset: MIMIC-III (46,520 patients, public clinical data, Beth Israel Deaconess Medical Center)
Reproducibility: All code open source via PyHealth (github.com/sunlabuiuc/PyHealth). Any researcher with GPU access can replicate.