05
Evals & Observability
Evaluation is the production gate of AI systems: the discipline that catches regressions before your users do, and separates the few GenAI projects that ship from the many that stall.
Topics we expect
- Eval-driven development in practice
- LLM observability and distributed tracing
- LLM-as-a-judge and human feedback loops
- Regression testing for non-deterministic systems
- Reliability engineering for AI features
We want to hear from: AI engineers, ML engineers, data scientists, and tech leads who run evals in production and can show how they changed engineering decisions.
06
Context & Memory Engineering
A large share of agent failures are context failures. Deciding what enters the context window, what gets retrieved, what gets remembered, and at what cost has become an engineering discipline of its own.
Topics we expect
- Context design beyond prompt engineering
- Retrieval strategies: agentic retrieval, RAG evolutions, long-context trade-offs
- Short-term and long-term agent memory as an infrastructure component
- Cost / quality / latency arbitration in context design
We want to hear from: AI engineers, data scientists, and software engineers who build LLM systems and have hard-won lessons about what belongs in the context.
07
Agentic Architectures & Orchestration
Agent architectures that ship, and a clear-eyed look at those that should not have been built.
Topics we expect
- Architectures of production agentic systems, end to end
- When you do NOT need multi-agent systems
- MCP and tool integration at scale
- Orchestration patterns, failure modes, and post-mortems
We want to hear from: AI engineers, software engineers, architects, and tech leads running agents in production, especially the ones willing to share what failed.
08
Security, Sovereignty & Cost Control
Keeping control of your AI systems: who they talk to, what they can do, where they run, and what they cost.
Topics we expect
- Prompt injection in the real world, and defenses that hold
- Identity, least privilege, and accountability for agents
- The MCP attack surface
- Open-weight and European models: self-hosting as a control decision
- Taming inference costs: model selection, caching, routing, FinOps for AI workloads
We want to hear from: AI engineers, security engineers, platform engineers, CISOs, and CTOs who secure, localize, or pay for AI systems at scale.