Hands-on Lab | Intermediate | Pro

LLM-as-a-Judge

Learn to evaluate open-ended AI output by building an LLM judge from scratch. Across six short labs you grade a support bot's freeform replies on a rubric, then validate the judge against human labels and harden it against bias. By the end you will have a small, trustworthy evaluation harness you can point at any open-ended task.
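To make "validate the judge against human labels" concrete, here is a minimal sketch of one common way to do it: compare the judge's labels with human labels using raw agreement and Cohen's kappa. This is an illustration under stated assumptions, not the course's own harness. The function name `agreement_and_kappa` and the sample pass/fail labels are hypothetical.

```python
from collections import Counter


def agreement_and_kappa(judge, human):
    """Return (raw agreement, Cohen's kappa) for two equal-length label lists."""
    if len(judge) != len(human) or not judge:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(judge)

    # Fraction of items where the judge and the human gave the same label.
    observed = sum(j == h for j, h in zip(judge, human)) / n

    # Agreement expected by chance, from each rater's label frequencies.
    judge_counts = Counter(judge)
    human_counts = Counter(human)
    expected = sum(
        judge_counts[label] * human_counts[label] for label in judge_counts
    ) / (n * n)

    if expected == 1.0:
        # Both raters used one identical label for every item; kappa is
        # formally undefined here, so this sketch reports 1.0 by convention.
        return observed, 1.0
    return observed, (observed - expected) / (1 - expected)


# Hypothetical pass/fail labels for six support replies.
judge_labels = ["pass", "pass", "fail", "pass", "fail", "fail"]
human_labels = ["pass", "fail", "fail", "pass", "fail", "pass"]

observed, kappa = agreement_and_kappa(judge_labels, human_labels)
print(f"agreement={observed:.2f} kappa={kappa:.2f}")  # agreement=0.67 kappa=0.33
```

Kappa corrects raw agreement for matches that would happen by chance, which matters when one label dominates the dataset.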

6 labs

Last updated: June 2026


6 hands-on labs

Real workspace with file editing

Automated checkpoint grading

Included with Pro

What you'll learn

Build real Agents you can ship

Practice prompting in a real workspace

Pass automated checkpoints to advance

Keep portfolio-ready projects when you finish

Labs in this course


1. Why rules can't grade this (Lab 1 of 6, Pro)
2. Your first judge call (Lab 2 of 6, Pro)
3. A rubric and a score (Lab 3 of 6, Pro)
4. Who judges the judge? (Lab 4 of 6, Pro)
5. Bias and reliability (Lab 5 of 6, Pro)
6. Capstone: ship the judge (Lab 6 of 6, Pro)

How labs work

Real workspace

Edit files and watch an Agent work alongside you.

Checkpoints

Automated checks confirm you've nailed each step.

Keep your work

Projects persist so you finish with something to ship.