Scale AI's $14B Meta Deal & What's Actually Next

Jason Droege

CEO, Scale AI

OCT 9 2025

The Company

Scale AI: The Data Layer Under Every Frontier Model

"Every major AI model you use was trained with Scale's help. We're the infrastructure that makes AI reliable."

Scale provides data labeling, RLHF, and AI evaluation services
The $14B Meta deal: Meta's roughly $14.3B investment for a 49% stake in Scale
Training data quality determines model quality more than architecture
The government bet: defense AI is Scale's fastest-growing vertical

Framework

The Scale AI Model

$14B

Meta investment (49% stake)

$13.8B

Scale valuation (2024 Series F round)

10+

frontier model customers

Scale's core service: human-in-the-loop data annotation at massive scale
RLHF platform: the feedback loop that aligns models to human values
Eval-as-a-service: ongoing model evaluation for enterprise customers
Government/defense: separate unit building AI for US national security

Jason's thesis: Scale is not a services company. It's an AI infrastructure company that happens to need humans to deliver the infrastructure.

Enterprise AI at Scale

What Scale Learned

Insight 1: Data quality compounds — better training data produces dramatically better models
Insight 2: Enterprise AI needs continuous evaluation, not just pre-deployment testing
Insight 3: Government AI is 5 years behind commercial AI but catching up fast
Insight 4: The data moat is real — it's harder to replicate than the model architecture

The data moat

Scale's most defensible asset is not its technology — it's its relationships with expert annotators in specialized domains.

The government opportunity

US defense AI is a multi-decade market. Scale is positioning as the trusted data partner for national security AI.

Playbook

Think About AI Data

Your training data IS your model — invest in its quality before model selection
Evals are not a one-time task — build continuous evaluation into your AI roadmap
Domain-specific data is worth 10× general data for domain-specific tasks
The data flywheel: more users → more data → better model → more users
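
To make the "continuous evaluation" point concrete, here is a minimal sketch of a recurring eval check. It is an illustration only, not Scale's platform or API: `EvalCase`, `accuracy`, `check_regression`, the exact-match scoring, and the 0.02 tolerance are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class EvalCase:
    """One hypothetical eval item: a prompt and the answer we expect."""
    prompt: str
    expected: str


def accuracy(model: Callable[[str], str], cases: Sequence[EvalCase]) -> float:
    """Fraction of cases where the model's output exactly matches the expected answer."""
    if not cases:
        raise ValueError("eval set is empty")
    correct = sum(model(c.prompt).strip() == c.expected for c in cases)
    return correct / len(cases)


def check_regression(score: float, baseline: float, tolerance: float = 0.02) -> bool:
    """Return True when the score has dropped below the baseline by more than the tolerance."""
    return score < baseline - tolerance


# Toy stand-in for a deployed model (assumption: a real check would call the live model).
_ANSWERS = {"2+2": "4", "capital of France": "Paris", "3*3": "6"}


def toy_model(prompt: str) -> str:
    return _ANSWERS.get(prompt, "")


CASES = [
    EvalCase("2+2", "4"),
    EvalCase("capital of France", "Paris"),
    EvalCase("3*3", "9"),
    EvalCase("opposite of hot", "cold"),
]

current_score = accuracy(toy_model, CASES)          # 2 of 4 cases match
needs_attention = check_regression(current_score, baseline=0.75)
```

Run on a schedule (for example after every model or prompt change), the same check turns evaluation into an ongoing signal rather than a one-time gate. Exact-match scoring is the simplest possible metric; a real eval would likely use task-specific graders.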

The Scale prediction: Jason believes the AI training data market will be larger than the cloud market within 5 years. The data that trains AI is the new oil.

Contrarian

AI Data Myths

✗ More data is always better
INSTEAD → ✓ More high-quality data is always better. Scale's value is quality, not just volume.

✗ Open source data eliminates the need for Scale
INSTEAD → ✓ Open source data produces open source quality. Frontier models require frontier data quality.

✗ AI will soon generate its own training data
INSTEAD → ✓ Training models mostly on AI-generated data risks model collapse. Human feedback remains essential for alignment.

✗ Data labeling is a commodity
INSTEAD → ✓ Expert data labeling is never a commodity. Medical, legal, and defense labeling require rare expertise.
