👥 Team & Culture

The Benevolent Dictator Protocol

by Hamel Husain & Shreya Shankar
Co-Founders of the 'Build Your Own Evals' Course at Consulting / UC Berkeley

Hamel is a machine learning engineer with experience at GitHub and Airbnb, now a leading AI consultant. Shreya is a computer scientist and researcher at UC Berkeley, specializing in ML operationalization. Together, they run the top-rated course on Maven about building AI evaluations.

🎙️ Episode Context

This episode demystifies 'Evals' (evaluations) for AI products, arguing that they are the highest-ROI activity for AI teams. Hamel and Shreya demonstrate a practical workflow that runs from manual error analysis ('open coding') to automated 'LLM-as-a-Judge' systems. They challenge the misconception that evals are just unit tests, framing them instead as a continuous data analysis process that replaces traditional PRDs for AI agents.

🎯 Problem It Solves

Analysis paralysis: teams debate what counts as a 'good' response, which slows down the iteration cycle.

📖 Framework Overview

Instead of designing by committee, appoint one domain expert (often the Product Manager) to define the 'ground truth' for evaluations. Their taste becomes the initial standard that the model is aligned against.
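One way to make "align against one person's taste" concrete is to treat the expert's pass/fail labels as ground truth and measure how often an automated judge agrees with them. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: `LabeledExample`, `agreement_with_expert`, and `toy_judge` are hypothetical names, the sample emails are invented for illustration, and the toy judge is a plain string check standing in for an LLM-as-a-Judge call.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple


@dataclass(frozen=True)
class LabeledExample:
    output: str        # model response under review
    expert_pass: bool  # verdict from the single appointed expert (ground truth)


def agreement_with_expert(
    examples: Iterable[LabeledExample],
    judge: Callable[[str], bool],
) -> Tuple[float, float]:
    """Return (true_positive_rate, true_negative_rate) of `judge` vs. the expert."""
    tp = fn = tn = fp = 0
    for ex in examples:
        predicted_pass = judge(ex.output)
        if ex.expert_pass:
            if predicted_pass:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_pass:
                fp += 1
            else:
                tn += 1
    # Guard against empty classes so the function never divides by zero.
    tpr = tp / (tp + fn) if (tp + fn) else float("nan")
    tnr = tn / (tn + fp) if (tn + fp) else float("nan")
    return tpr, tnr


# Invented examples; in practice these come from the expert's review of real outputs.
examples = [
    LabeledExample("Given your background in ML, I think you'd be a fit.", False),
    LabeledExample("Your talk on error analysis changed how our team labels traces.", True),
    LabeledExample("We have an exciting opportunity for you.", False),
    LabeledExample("I saw your open-source eval tooling and had a specific question.", True),
]


def toy_judge(text: str) -> bool:
    # Stand-in for an LLM judge: fail anything with the flagged opener.
    return not text.lower().startswith("given your background")


tpr, tnr = agreement_with_expert(examples, toy_judge)
print(f"agreement on expert passes: {tpr:.2f}, on expert fails: {tnr:.2f}")
```

Reporting agreement on passes and fails separately, rather than a single accuracy number, shows whether the judge is too lenient or too strict relative to the expert. Here the toy judge misses one response the expert rejected, which is the kind of gap that further iteration on the judge would aim to close.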

🧠 Framework Structure

1️⃣ Single Source of Truth: One person's ...
2️⃣ Domain Expertise: The dictator must u...
3️⃣ Speed over Consensus: Prioritize gett...

When to Use

Early stage development or when establishing a new class of evaluations for subjective tasks.

⚠️ Common Mistakes

Letting engineers define product quality without domain context, or trying to average opinions from a group.

💼 Real World Example

Deciding that a recruiting email starting with 'Given your background...' is bad/generic, based solely on the PM's taste, and optimizing against that.
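Once the expert has made a call like this, it can be captured as a cheap code-based check that runs on every generated email. The following is a minimal sketch under stated assumptions: `GENERIC_OPENERS` and `has_generic_opener` are hypothetical names, and the phrase list holds only the one opener named in the example above; any further entries would come from the expert's own review.

```python
# Hypothetical list of openers the appointed expert has flagged as generic.
# Only the first entry comes from the example; extend it from real review notes.
GENERIC_OPENERS = ("given your background",)


def has_generic_opener(email_body: str) -> bool:
    """Return True if the email starts with a phrase the expert flagged as generic."""
    normalized = email_body.lstrip().lower()
    # str.startswith accepts a tuple and matches if any prefix matches.
    return normalized.startswith(GENERIC_OPENERS)


drafts = [
    "Given your background in data engineering, I wanted to reach out.",
    "Your write-up on labeling traces was the clearest I've read.",
]
flagged = [draft for draft in drafts if has_generic_opener(draft)]
print(f"{len(flagged)} of {len(drafts)} drafts flagged")
```

A check this simple only catches the exact pattern the expert named. It is a starting point for tracking that one failure, not a general measure of email quality.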

"You don't want to make this process so expensive that you can't do it. You can appoint one person whose taste that you trust."

Hamel Husain & Shreya Shankar

Keywords

#benevolent #dictator #protocol #team #culture