Expert Evals Are Creating the Fastest-Growing AI Companies
Brendan Foody
Founder; expert eval company builder
SEP 18 2025
The Opportunity
Expert Knowledge + AI Evals = Moat
"The companies that will win in AI are not building better models — they're building better evaluation systems with domain experts."
Domain experts writing evals = defensible moat that can't be replicated with scale alone
Medical AI needs doctors. Legal AI needs lawyers. Financial AI needs CFA holders.
The expert eval flywheel: better evals → better models → more expert demand → better evals
Brendan's thesis: expert eval companies are the fastest-growing category in AI infrastructure
Framework
The Expert Eval Business Model
$2B+: expert eval market size
100×: value of an expert eval vs. a crowdsourced one
3 years: time to build a defensible expert network
The network is the moat: expert annotators don't leave once they're integrated
Domain specificity: one-size-fits-all doesn't work for expert evals
Trust and verification: expert credentials must be verified, not just claimed
The pricing power: expert evals command 10-50× the price of commodity annotation
Brendan's insight: The bottleneck for AI improvement is not compute or architecture. It's the availability of domain experts who can evaluate model outputs.
Expert Eval Categories
Where Expert Evals Create Most Value
The medical AI bottleneck
Every medical AI company needs doctors to validate outputs. Building a vetted doctor network is a 3-year investment.
The legal compliance case
Lawyers must evaluate legal AI outputs to detect hallucinations. The liability risk makes expert evals mandatory.
Playbook
Build Expert Eval Capacity
Identify your 3 highest-stakes AI outputs: those need expert evals, not crowdsourced ones
Build relationships with domain experts before you need them at scale
Create a certification process: how do you verify expert qualification for your domain?
Offer performance-based compensation: experts who write better evals should earn more
The competitive moat: An expert eval network takes 3+ years to build at quality. It's the AI moat that money can't buy quickly.
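The first and third playbook steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `stakes` score, the `credential_verified` flag, and every class and function name below are hypothetical and do not come from the talk. The idea shown is to rank output types by stakes, send the top three to expert review, and admit only experts whose credentials have been verified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutputType:
    name: str
    stakes: int  # hypothetical 1-10 score: how costly a wrong answer is


@dataclass(frozen=True)
class Expert:
    name: str
    domain: str
    credential_verified: bool  # assumption: set by your own certification process


def expert_review_set(outputs, top_n=3):
    """Return the names of the top_n highest-stakes output types."""
    ranked = sorted(outputs, key=lambda o: o.stakes, reverse=True)
    return {o.name for o in ranked[:top_n]}


def eligible_experts(experts, domain):
    """Keep only experts in the domain whose credentials were verified, not just claimed."""
    return [e for e in experts if e.domain == domain and e.credential_verified]


def route(output, expert_set):
    """Send high-stakes outputs to experts and everything else to crowdsourced review."""
    return "expert" if output.name in expert_set else "crowd"


# Illustrative data only; the scores are made up for the example.
outputs = [
    OutputType("dosage recommendation", 10),
    OutputType("contract clause summary", 8),
    OutputType("earnings forecast", 7),
    OutputType("marketing copy", 2),
]
experts = [
    Expert("A", "medical", True),
    Expert("B", "medical", False),  # claimed but unverified, so excluded
    Expert("C", "legal", True),
]

expert_set = expert_review_set(outputs)
print(route(outputs[0], expert_set))  # expert
print(route(outputs[3], expert_set))  # crowd
print([e.name for e in eligible_experts(experts, "medical")])  # ['A']
```

How the stakes score is assigned is the real design decision here; the code only makes the routing rule explicit once that judgment has been made.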
Contrarian
Eval Business Myths
✗ Crowdsourcing scales expert quality. INSTEAD → ✓ Crowdsourcing scales volume. Expert quality doesn't scale: it's earned through credentials and judgment.
✗ AI can evaluate its own outputs. INSTEAD → ✓ AI can evaluate syntactic correctness. Domain experts evaluate semantic correctness and safety.
✗ Models will soon not need evals. INSTEAD → ✓ Better models need harder evals. The bar rises with capability.
✗ One eval framework fits all domains. INSTEAD → ✓ Domains require custom eval frameworks. Medical, legal, and code evals share almost no methodology.