Based on Lenny's Podcast data
Lenny's Knowledge Sketch · AI Product Building

Why Building AI Products Is
Fundamentally Different

Aishwarya Reganti
& Kiriti Badam
AI Researchers & Practitioners · Maven Course Instructors
OpenAI / ex-Amazon / ex-Google / ex-Microsoft
The Two Differences

Non-Determinism & the Agency-Control Trade-Off

AGENCY vs. CONTROL · V1: Human → V2: Hybrid → V3: Autonomous
"You don't know how the user might behave with your product, and you also don't know how the LLM might respond to that. You're now working with an input, an output, and a process, and you don't understand any of the three very well."
  • Non-determinism: both user input and LLM output are unpredictable — unlike traditional software
  • Agency-control trade-off: every bit of autonomy granted to an agent is control relinquished by humans
  • Traditional software has a well-mapped decision engine; AI has a fluid, natural-language interface
  • With agentic systems, the uncertainty compounds at every step
The Framework

Continuous Calibration, Continuous Development

SCOPE & CURATE DATA → BUILD & EVALUATE → DEPLOY & MONITOR → FEEDBACK FLYWHEEL
  • Scope the capability first — define expected inputs and outputs before writing a line of code
  • Curating a small dataset surfaces team misalignment on how the product should behave
  • Set evaluation metrics deliberately — not just LLM judges but real user signals
  • Deploy in low-stakes zones first, log human decisions to fuel the flywheel
  • Calibrate autonomy as trust is earned — expand agency step by step
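The "log human decisions to fuel the flywheel" step can be sketched as a minimal review log. This is an illustrative implementation, not from the episode; the record fields and `log_review` helper are assumptions about what such a log might capture.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json

@dataclass
class ReviewRecord:
    """One logged human decision during the human-in-the-loop phase.
    Field names are hypothetical, chosen for illustration."""
    input_text: str    # what the user asked
    model_output: str  # what the model proposed
    human_action: str  # "approved" | "edited" | "rejected"
    final_output: str  # what actually shipped to the user

def log_review(record: ReviewRecord, path: str = "reviews.jsonl") -> None:
    """Append the decision to a JSONL log; over time this file becomes
    eval cases and training data -- the fuel for the feedback flywheel."""
    entry = asdict(record) | {"ts": datetime.now(timezone.utc).isoformat()}
    with open(path, "a") as f:
        f.write(json.dumps(entry) + "\n")
```

Append-only JSONL keeps the logging step cheap enough to run on every human decision, which is the whole point: the log, not the model, is the asset that compounds.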
50+
AI product deployments
35+
research papers (Ash)
#1
rated AI course on Maven
3 yrs
old field — no playbooks yet
The Air Canada lesson
An agent hallucinated a refund policy, and the company had to honor it. The real-world cost of skipping the calibration loop: legal liability and eroded customer trust.
Pain is the new moat
"Successful companies right now building in any new area, they are going through the pain of learning this, implementing this and understanding what works and what doesn't work." — Kiriti
Step-by-Step Playbook

Graduate Agency Deliberately: Three-Stage Ladder

V1: SUGGEST (high control) → V2: ACT WITH REVIEW → V3: AUTONOMOUS (low control) · MORE AGENCY →
  • Customer support V1: AI suggests reply → human agent approves & sends
  • Customer support V2: AI responds directly but escalates edge cases to humans
  • Customer support V3: AI handles end-to-end; issues refunds, files feature requests
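The customer-support ladder above amounts to a routing rule keyed on the current agency level. The sketch below is a hypothetical gate, not the speakers' code; the `confidence` score and the 0.9 threshold are stand-ins for whatever escalation signal a real system would use.

```python
from enum import Enum

class Agency(Enum):
    SUGGEST = 1          # V1: human approves & sends every reply
    ACT_WITH_REVIEW = 2  # V2: AI sends directly, escalates edge cases
    AUTONOMOUS = 3       # V3: AI handles the ticket end-to-end

def route_reply(draft: str, confidence: float, level: Agency) -> str:
    """Route a model-drafted reply according to the agency level.
    `confidence` is an assumed classifier or self-reported score in [0, 1]."""
    if level is Agency.SUGGEST:
        return "queue_for_human_approval"
    if level is Agency.ACT_WITH_REVIEW:
        # Edge cases (low confidence) go back to a human, per the V2 bullet.
        return "send" if confidence >= 0.9 else "escalate_to_human"
    return "send"  # AUTONOMOUS: ship it
```

Graduating agency then becomes a one-line config change from `SUGGEST` to `ACT_WITH_REVIEW`, made only after the logged V1 approvals show the drafts are trustworthy.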
Coding assistant ladder

V1: inline completion & snippets
V2: generate tests/refactors for human review
V3: apply changes & open PRs autonomously

Marketing assistant ladder

V1: draft emails or social copy
V2: build & run multi-step campaigns
V3: launch A/B tested, auto-optimized campaigns

"When you start small, it forces you to think about what is the problem that I'm going to solve. One easy slippery slope is to keep thinking about complexities of the solution and forget the problem."
Ways of Working

How Leaders Must Change to Build AI Well

  • Block daily time to personally use & learn AI — a CEO Ash worked with blocked 4–6 AM every day labelled "catching up with AI"
  • Leaders must get back to being hands-on — reading agent traces alongside engineers is the new product review
  • PMs, engineers, and data folks must share the same feedback loop — old handoffs are broken
  • Be comfortable that your intuitions might be wrong; assume you are the least-informed person in the room
  • Log everything humans do in the human-in-the-loop phase — it becomes your training data and flywheel fuel
On evals: the nuance
"Evals" means different things to different people. Independent model benchmarks (e.g., LMArena) ≠ product evals. Build evals specific to your use case and complement them with user signals and, yes, some vibes; even Codex at OpenAI does this.
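"Complement LLM judges with user signals" can be made concrete as a blended health metric. The function and its weights below are purely illustrative assumptions, not anything the guests prescribe; the point is that judge scores are one input among several.

```python
def product_health(judge_scores: list[float],
                   thumbs_up: int, thumbs_down: int,
                   escalations: int, total: int) -> float:
    """Blend an LLM-judge average with real user signals into one 0..1 score.
    The 0.4 / 0.4 / 0.2 weights are hypothetical; tune them per product."""
    judge = sum(judge_scores) / len(judge_scores)            # LLM-judge avg, 0..1
    approval = thumbs_up / max(thumbs_up + thumbs_down, 1)   # explicit user votes
    escalation_rate = escalations / max(total, 1)            # human takeovers
    return 0.4 * judge + 0.4 * approval + 0.2 * (1 - escalation_rate)
```

A composite like this catches the failure mode the episode warns about: judge scores can stay flat while escalations climb, and only the user-signal terms will surface it.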
Contrarian

AI Product Myths That Will Sink Your Roadmap

Build a fully autonomous agent on day one to stay competitive INSTEAD → Start with a human-suggestion layer. Autonomy must be earned by building trust through logged, observable behavior — not assumed from the start.
Being first to ship an agent is what matters, so move fast INSTEAD → "It's not about being the first company to have an agent. It's about have you built the right flywheels so that you can improve over time." A first mover without a flywheel loses to a second mover with one.
Comprehensive evals will tell you if your AI product is working INSTEAD → For complex agentic products, you will never write enough evals to capture emerging patterns. User signals, A/B tests, and deliberate human review are equally essential — and sometimes more so.
Pain in AI product building means you're doing it wrong INSTEAD → "Pain is the new moat." The companies succeeding are going through the exact same struggle — the difference is they treat that pain as proprietary learning that compounds into durable advantage.