Based on Lenny's Podcast data
Lenny's Knowledge Sketch

Why AI Safety is
The Hardest Problem

Benjamin Mann
Tech Lead, Anthropic
2024
The Stakes

AGI in 2028:
The Singularity Timeline

CAPABILITY TIMELINE
"Creating powerful AI might be the last invention humanity ever needs to make. If it goes poorly, it means a bad outcome for humanity forever. If it goes well, the sooner it goes well, the better."
  • 50th percentile chance of superintelligence by 2028
  • Once we reach superintelligence, it will be too late to align the models
  • The window for getting alignment right is NOW, before AGI arrives
  • Economic Turing Test: when AI passes for 50% of money-weighted jobs
Why He Left OpenAI

Safety Wasn't Top Priority

  • Started at OpenAI in 2016 inspired by Superintelligence by Nick Bostrom
  • Architect of GPT-3, witnessed the scaling laws firsthand
  • As models grew more capable, safety concerns became clearer, not resolved
  • The path to AGI is now visible via language modeling, not theoretical
  • At OpenAI, safety wasn't the top organizational priority
The mission clarity

At Anthropic, best case = affecting the future of humanity. At Meta, best case = making money. Not a hard choice for mission-driven people.

The safety case, made concrete

In 2016, alignment felt theoretical. Today, language models demonstrably understand human values. Problem is hard but solvable.

The Transformation Ahead

We're Hitting the Knee of the Exponential

  • People are bad at modeling exponential progress — it looks flat then suddenly vertical
  • Customer service automation: 82% resolution rates without human involvement (Fin, Intercom)
  • Software engineering: Claude writes 95% of code, enabling 10-20X productivity per engineer
  • Right now: labor expansion is likely. Jobs transform, not disappear (yet)
  • Long-term: 20 years past singularity, capitalism itself may look completely different
The skepticism is rational

Most people don't feel AI impact yet because exponentials look flat at the beginning. Widespread transformation won't be visible until suddenly it is.

The transition period risk

The scary part isn't the far future (abundance). It's the next 20 years of massive job displacement and economic restructuring.

Playbook

Future-Proof Your Career

  • Use AI tools ambitiously, not conservatively — ask for the hard change, not incremental tweaks
  • Be willing to learn new tools constantly — old patterns of tool use break down
  • Legal and finance teams are seeing 10X productivity gains right now; that's not theoretical
  • Vulnerability is honest: even Ben and Lenny will be replaced eventually
  • The transition is what matters — prepare for rapid change, not permanence
The real advantage

Success isn't about being immune. It's about being faster at learning and adapting than the pace of change itself.
Contrarian Truth

What Everyone Gets Wrong About AI

  • "Progress is slowing down" — INSTEAD → Progress is accelerating. Models now release every month, not every year. We're time-compressing.
  • "Safety is just PR" — INSTEAD → Publishing failures builds trust with policymakers. Hiding risks is worse for credibility and for society.
  • "Alignment is impossible" — INSTEAD → It's hard, but language models demonstrably understand human values. The problem is solvable if we start now.
  • "AI won't affect my job soon" — INSTEAD → It already does. 82% of support tickets close automatically. You're on the knee of the curve right now.