School of Computational Arts & Sciences · Graduate · 6 ECTS
AI610 · LLM Systems & Evaluation
Build production-grade LLM applications beyond prompting: retrieval and grounding, safety and policy checks, tool use, and systematic evaluation harnesses. Students implement test suites for quality, hallucination risk, and regression, then iterate on architecture with measured evidence.
- llm
- evaluation
- responsible-ai
| Overview | Details |
|---|---|
| Code | AI610 |
| Title | LLM Systems & Evaluation |
| School | School of Computational Arts & Sciences |
| Level | Graduate |
| Credits | 6 ECTS |
What you will learn
- Build a retrieval-augmented generation (RAG) pipeline with measurable quality gates
- Design safety checks (policy filters, refusal handling) and test them
- Run offline and online evaluations and interpret trade-offs
Prerequisites
No formal prerequisites.
Assessment
| Component | Weight |
|---|---|
| Coursework | 60% |
| Final project | 40% |
Weekly outline
Week 1
- LLM system anatomy: prompting, retrieval, tools, and caching
Week 2
- Evaluation sets: gold answers, rubrics, and human review protocols
Week 3
- RAG basics: chunking, embedding, ranking, and failure analysis
Week 4
- Safety: policy filters, prompt injection, and data leakage