Hands-on Lab · Intermediate · Pro
LLM-as-a-Judge
Learn to evaluate open-ended AI output by building an LLM judge from scratch. Across six short labs you grade a support bot's freeform replies on a rubric, then validate the judge against human labels and harden it against bias. By the end you will have a small, trustworthy evaluation harness you can point at any open-ended task.
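As a rough illustration of what "validate the judge against human labels" can mean, here is a minimal sketch that compares a judge's verdicts with human verdicts using raw agreement and Cohen's kappa. The label values and the function names `agreement_rate` and `cohens_kappa` are assumptions made for this example and are not taken from the course materials. The labs may use a different metric or data format.

```python
from collections import Counter


def agreement_rate(judge, human):
    """Fraction of items where the judge and the human gave the same label."""
    if len(judge) != len(human) or not judge:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(j == h for j, h in zip(judge, human)) / len(judge)


def cohens_kappa(judge, human):
    """Agreement corrected for the agreement expected by chance."""
    n = len(judge)
    p_observed = agreement_rate(judge, human)
    judge_counts = Counter(judge)
    human_counts = Counter(human)
    # Chance agreement: probability both raters pick the same label
    # if each labeled independently at their own observed rates.
    labels = set(judge_counts) | set(human_counts)
    p_chance = sum(judge_counts[l] * human_counts[l] for l in labels) / (n * n)
    if p_chance == 1.0:
        # Both raters used a single identical label; kappa is undefined,
        # so this sketch reports perfect agreement.
        return 1.0
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical labels for six support-bot replies (illustrative data only).
judge_labels = ["pass", "pass", "fail", "pass", "fail", "fail"]
human_labels = ["pass", "fail", "fail", "pass", "fail", "pass"]

rate = agreement_rate(judge_labels, human_labels)
kappa = cohens_kappa(judge_labels, human_labels)
print(f"agreement={rate:.2f} kappa={kappa:.2f}")
```

Kappa is reported alongside raw agreement because a judge that always answers "pass" can score high raw agreement on a skewed dataset while adding no signal. Kappa discounts the agreement that chance alone would produce.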
6 labs
Last updated: June 2026

What you'll learn
Build real Agents you can ship
Practice prompting in a real workspace
Pass automated checkpoints to advance
Keep portfolio-ready projects when you finish
Labs in this course
- Lab 1 of 6 (Pro): Why rules can't grade this
- Lab 2 of 6 (Pro): Your first judge call
- Lab 3 of 6 (Pro): A rubric and a score
- Lab 4 of 6 (Pro): Who judges the judge?
- Lab 5 of 6 (Pro): Bias and reliability
- Lab 6 of 6 (Pro): Capstone: ship the judge
How labs work
Real workspace
Edit files and watch an Agent work alongside you.
Checkpoints
Automated checks confirm you've nailed each step.
Keep your work
Projects persist so you finish with something to ship.