Fischbrötchen and Failure Rates — I'm Speaking at AWS Summit Hamburg

Hamburg won me over.
Last year, the AWS Summit left Berlin for Hamburg. After years of presenting at the Berlin Summit, I wasn’t sure how I’d feel about the move. Then I opened the Generative AI track to a packed room — people standing in the back — and spent the rest of the day in conversations that reminded me why these events matter. The Fischbrötchen at the Landungsbrücken afterwards sealed the deal. Hamburg won me over [1].
On May 20th, I’m back on stage.
AIM001: From Demo to Deployment — Solving Agentic AI’s Toughest Challenges
📅 May 20, 2026 · 📍 Hamburg Messe 🔗 Full agenda — search for AIM001

The prototype-to-production chasm. Most AI projects die here.
Here’s the number that frames the session: 46% of AI proof-of-concept projects are scrapped before they ever reach production [2]. The proportion of companies abandoning most of their AI initiatives has risen from 17% to 42% in a single year. Not because the technology doesn’t work. The demos are impressive. But the path from “works on my laptop” to “serves a thousand users reliably” is where most projects quietly die.
I’ve been analysing these gaps and sharing my findings with you over the past months:
- Why AI coding productivity is 10%, not 10x, and why that’s actually fine [3]
- Why your AI models have an expiry date, and what to do before they expire [4]
- Why security is job zero when agents can act autonomously [5]
- Why RAG is still needed even with million-token context windows [6]
- Why the shift from cloud-native to AI-native is a discipline, not a feature [7]
Each of these articles explored one facet of the production challenge. The Summit talk brings them together into a structured framework: four pillars that separate demos from deployments.
The Four Pillars

Operational Excellence, Data and Context, Trust, Reliability.
Operational Excellence — Can you see what’s happening and control what you’re spending? Your PoC cost €200 a month. Your first production invoice? €47,000. Unlike traditional applications, agents make decisions you didn’t explicitly program. Without observability into every reasoning step and tool call, you’re debugging a black box at scale.
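To make "observability into every reasoning step" concrete, here is a minimal sketch of per-session tracing with cost accounting. All names (`AgentTrace`, `ToolCallRecord`) and the per-token price are hypothetical placeholders, not a real SDK; in practice you would emit these records to your observability backend.

```python
import json
import time
from dataclasses import dataclass, field, asdict

@dataclass
class ToolCallRecord:
    """One agent step: which tool ran, with what args, at what cost."""
    tool: str
    args: dict
    duration_ms: float
    tokens_in: int
    tokens_out: int
    cost_eur: float

@dataclass
class AgentTrace:
    """Collects every tool call in a session so spend and behaviour
    can be inspected per session instead of per monthly invoice."""
    session_id: str
    steps: list = field(default_factory=list)

    def record(self, tool, args, duration_ms, tokens_in, tokens_out,
               eur_per_1k=0.003):  # placeholder price, not a real rate
        cost = (tokens_in + tokens_out) / 1000 * eur_per_1k
        self.steps.append(ToolCallRecord(tool, args, duration_ms,
                                         tokens_in, tokens_out, cost))

    def total_cost(self):
        return sum(s.cost_eur for s in self.steps)

    def to_json(self):
        return json.dumps({"session": self.session_id,
                           "steps": [asdict(s) for s in self.steps]})
```

With a structure like this, a €47,000 invoice stops being a surprise: you can alert when `total_cost()` for any session or tenant crosses a budget, long before the bill arrives.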
Data & Context — Is your agent grounded in reality? Most agent failures aren’t model failures. They’re context failures. Wrong document retrieved. Missing metadata. No memory of what the user said last week. As I wrote in my RAG article: the model is the student, retrieval is the open book [6]. Without the right book, even the best student fails. And unlike a search engine, an agent acts on what it retrieves. Bad context doesn’t just produce a wrong answer. It triggers wrong actions.
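The "bad context triggers wrong actions" point can be sketched as a retrieval guard: before the agent acts, check that the best document is both relevant and fresh, and refuse otherwise. The corpus, the toy word-overlap scorer, and the thresholds below are illustrative assumptions, not a production retriever.

```python
from datetime import date

# Hypothetical corpus: each entry carries the metadata the agent needs
# to judge whether the context is trustworthy enough to act on.
CORPUS = [
    {"id": "refund-policy", "text": "refunds are issued within 14 days",
     "last_reviewed": date(2026, 1, 10)},
    {"id": "old-faq", "text": "refunds take 30 days",
     "last_reviewed": date(2023, 5, 1)},
]

def score(query, text):
    """Toy relevance: fraction of query words found in the document."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q)

def retrieve(query, today, max_age_days=365, min_score=0.3):
    """Return the best fresh, relevant document, or None.
    An agent that would act on stale or off-topic context
    should refuse instead of guessing."""
    fresh = [d for d in CORPUS
             if (today - d["last_reviewed"]).days <= max_age_days]
    ranked = sorted(fresh, key=lambda d: score(query, d["text"]),
                    reverse=True)
    if ranked and score(query, ranked[0]["text"]) >= min_score:
        return ranked[0]
    return None
```

Note what the guard does to the stale FAQ: it never reaches the agent at all, so the wrong 30-day answer cannot turn into a wrong refund action.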
Trust — Is it safe? Can you control what the agent does? One user is easy. A thousand users with different roles, different data access, different compliance requirements? That’s where the CodeWall/McKinsey story becomes relevant: a $20 AI agent autonomously found SQL injection in a production system that traditional scanners missed [5]. Agents can be both the thing you’re securing and the thing attacking you. Session isolation, identity-scoped access, and real-time policy enforcement aren’t optional at scale.
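"Identity-scoped access" at its simplest is a deny-by-default policy check on every tool call. The roles, tool names, and `PolicyViolation` exception below are made up for illustration; a real system would enforce this with your identity provider and policy engine, not an in-process dict.

```python
# Hypothetical role -> allowed-tools policy, enforced on every call.
POLICY = {
    "analyst": {"read_reports"},
    "admin":   {"read_reports", "run_sql", "delete_records"},
}

class PolicyViolation(Exception):
    """Raised when an agent attempts a tool its caller is not scoped to."""

def authorize(role, tool):
    """Deny by default: unknown roles and unlisted tools are rejected."""
    if tool not in POLICY.get(role, set()):
        raise PolicyViolation(f"role {role!r} may not call {tool!r}")
    return True
```

The important property is the default: an agent acting for an analyst cannot run SQL no matter how it was prompted, because the check sits outside the model's reasoning.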
Reliability — Does it work consistently? Agents are non-deterministic. The same question can produce different answers. In a demo, that’s interesting. In production, it breaks trust. You need evaluation pipelines that run continuously, not just at launch. And you need to know when to use deterministic code instead of an LLM call. A Python function runs in milliseconds, costs nothing, and gets the same answer every time.
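The "deterministic code instead of an LLM call" pattern can be sketched as a router: queries that plain code can answer never reach the model. The arithmetic grammar is deliberately tiny, and `llm_call` is a stand-in for whatever client the agent actually uses.

```python
import re

def deterministic_or_llm(query, llm_call):
    """Route simple integer arithmetic to plain Python;
    send everything else to the model."""
    m = re.fullmatch(r"\s*(-?\d+)\s*([+\-*])\s*(-?\d+)\s*", query)
    if m:
        a, op, b = int(m.group(1)), m.group(2), int(m.group(3))
        ops = {"+": a + b, "-": a - b, "*": a * b}
        return ops[op]          # milliseconds, free, same answer every time
    return llm_call(query)      # non-deterministic path: needs evaluation
```

Every query handled on the deterministic branch is one fewer non-deterministic answer you have to evaluate, monitor, and pay for.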
The session covers these pillars with real deployment stories and patterns you can apply immediately. If you’re attending the full AI track, think of AIM001 as the map: it shows you the terrain, and the 14 other sessions in the track let you go deep on the areas that matter most to your situation.
Help Me Sharpen the Talk

Demo on the left. Production on the right. The gap is where the work happens.
The framework is in place and the content is grounded in real customer stories. What I’m looking for now is calibration: are there production challenges specific to our region that deserve more attention? Fresh war stories that confirm or challenge the patterns?
If you’re working on agentic AI, I’d appreciate your perspective:
- What broke when you moved your agent from demo to production?
- Where did cost surprise you most?
- How are you approaching testing for something that’s inherently non-deterministic?
- What’s the production challenge you haven’t found a good answer for yet?
Share your experiences in the comments or reach out directly.
Join Us in Hamburg
📅 May 20, 2026 📍 Hamburg Messe 🎟️ Free registration: aws.amazon.com/de/events/summits/hamburg/
Beyond AIM001, the AI track features 14 additional sessions: AgentCore deep dives, automated reasoning, model customization, and customer stories from Deutsche Bahn, Siemens, Delivery Hero, Würth, and Infineon. Whether you’re building your first agent or scaling an existing one, there’s a session for you.
See you in Hamburg. I’ll be the one at the Fischbrötchen stand between sessions.
Sources:
[1] AWS Summit Hamburg 2025 — Uncompressing: https://schristoph.online/blog/aws-summit-hamburg-2025-week—uncompressing/
[2] S&P Global / 451 Research, “Generative AI Shows Rapid Growth but Yields Mixed Results,” Oct 2025
[3] AI Coding Productivity: 10%, Not 10x: https://schristoph.online/blog/ai-productivity-10-percent-not-10x/
[4] Your AI Models Have an Expiry Date: https://schristoph.online/blog/model-lifecycle-management/
[5] Security Is Job Zero: https://schristoph.online/blog/security-is-job-zero/
[6] Is RAG Still Needed?: https://schristoph.online/blog/is-rag-still-needed/
[7] From Cloud-Native to AI-Native: https://schristoph.online/blog/from-cloud-native-to-ai-native/