On the Loop, Not In It: But Code Quality Still Matters
Yesterday one of my AI agents wasted 15 minutes chasing a bug that didn’t exist. The function was called transformPayload(), but it didn’t transform anything. It validated. The agent built three layers of transformation logic on top of it before realizing the name was a lie. I’ve seen this pattern dozens of times now. And it’s exactly why I think Kief Morris’s latest piece gets the big picture right but undersells one critical detail.
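A minimal sketch of the trap (the code and names here are my illustration, not the actual codebase): the function’s name promises a transformation, but the body only validates, so any caller, human or agent, reasonably assumes the transformation already happens.

```python
# Hypothetical illustration: the name promises transformation,
# but the body only validates and returns the input untouched.
def transformPayload(payload: dict) -> dict:
    """Despite the name, this only checks required fields and
    returns the payload unchanged: the 'transform' never happens."""
    required = {"id", "timestamp"}
    missing = required - payload.keys()
    if missing:
        raise ValueError(f"payload missing fields: {sorted(missing)}")
    return payload  # returned as-is, untransformed

# An honest name removes the trap for the next reader, human or agent:
validate_payload = transformPayload
```

The rename costs nothing, yet it is exactly the kind of signal an agent (or a new teammate) relies on when it cannot read every function body.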
💰 GenAI Advertising Goes Deeper Than You Think: From UX Banners to Training Data
Just before heading out for my lunch run, I read the Reuters Institute for the Study of Journalism’s article “Advertising is coming for GenAI”[1], which is interesting, but from my perspective only scratches the surface.
The Surface
The surface is, in a way, just the user interface through which human users interact with an AI agent. Here you can place ads, informed by the context of the human-AI interaction, in multiple places:
From Call Center to AI Agent Hub: The Future of Customer Support Is Here
Just returning from an internal Amazon Connect deep dive[1]. I haven’t touched this particular product in maybe five years?! Dirk Fröhner, I’m sure you remember our joint large-scale workshop with one of your customers.
What stuck with me from that time was how fast you can actually set up a contact center in the cloud (less than 30 minutes from zero to the first call received) and how much AI was already improving both the customer and agent experience back then. That hasn’t changed.
The most comprehensive overview on RAG I have seen. We have come a long way from vanilla RAG. I still remember the time of arguments that RAG was just a “hot fix”, soon to be obsolete. In reality it is not a fix but the backbone of the majority of enterprise applications.
Kudos to Jin for putting this together. It should go into every practitioner’s back pocket!
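For readers newer to the topic, the vanilla RAG baseline that the overview builds on can be sketched in a few lines. This is a toy illustration of mine: a bag-of-words overlap stands in for embedding similarity, and the final LLM call is omitted; real systems use an embedding model plus a vector store.

```python
# Toy vanilla RAG: retrieve the most relevant documents, then stuff
# them into the prompt as context for the model.
def score(query: str, doc: str) -> int:
    # Word-overlap count as a crude stand-in for embedding similarity.
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    # Rank the corpus by relevance to the query and keep the top k.
    return sorted(corpus, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str, corpus: list[str]) -> str:
    # Ground the model in retrieved context instead of parametric memory.
    context = "\n".join(retrieve(query, corpus))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

Everything the overview adds (rerankers, query rewriting, hybrid search, agentic retrieval loops) is a refinement of exactly this retrieve-then-generate skeleton.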
🔧 The Maintenance Trap: Why Your IT Systems Are More Like Plants Than Stones
After years of watching organizations struggle with outdated systems, I’ve written about a pattern we all know too well: the maintenance trap in IT.
Here’s the uncomfortable truth: We’ve all seen those systems that haven’t been updated in years. Aging interfaces, accumulating bugs, mounting security risks. We assess the cost of updates, weigh the business value, and often decide to “just skip this one.”
IT System Maintenance in the age of AI
Introduction - The Maintenance Trap in IT
You don’t need to be in the IT industry for long to have witnessed this firsthand; even non-IT users have. Systems that haven’t been maintained for ages. From a user perspective, you “just” see a perhaps dated user interface, features that stop evolving, and old bugs or quirks that become accepted by, possibly, generations of users. Still, you should keep an eye on this. Often it not only means that the system becomes cumbersome to use, but also that there are possibly no security updates being made anymore. We will see in a bit that updating might not even be possible anymore. So think about which kind of data you want to put in there.
“What Anthropic is describing is the weaker version of this technique: applied externally, without permission, at an adversarial scale. Yet 16 million exchanges suggest MiniMax found the weaker version worth the effort. That’s the structural problem: if meaningful capabilities can be extracted even from outputs alone, the model itself is a depreciating asset. The moat is the rate of improvement, not today’s benchmark score. The labs that can stay ahead of the distillation cycle have defensible positions. The ones that can’t are selling last quarter’s capability at this quarter’s price.” Straight to the point. Thanks, Julien, for this insightful analysis and for sharing it. TIL a lot!
From my perspective, balancing AI agents’ agency with control is one of the most important themes for 2026. We need to get this right, both as builders and as users of agentic AI systems.
Anthropic’s study “Measuring AI agent autonomy in practice”[1] fits nicely into this, as they started to study the autonomy of AI agents based on Claude Code usage and tool invocations. Already this first iteration provides some nice insights. There is obviously a strong focus on coding use cases, but it also indicates a wide variety of other use cases, which resonates with my experience from the customers I’m working with.
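One simple way to think about what such a study measures (my own illustrative sketch, not Anthropic’s actual methodology): treat the fraction of tool invocations an agent executes without a human approval step as a crude per-session autonomy score.

```python
# Illustrative autonomy metric over a session's tool-call log.
# This is an assumption-laden sketch, not Anthropic's method.
from dataclasses import dataclass

@dataclass
class ToolCall:
    tool: str
    human_approved: bool  # True if a human confirmed before execution

def autonomy_score(calls: list[ToolCall]) -> float:
    """Fraction of tool calls that ran without human approval (0.0-1.0)."""
    if not calls:
        return 0.0
    unattended = sum(1 for c in calls if not c.human_approved)
    return unattended / len(calls)
```

Even a metric this crude makes the agency-versus-control trade-off visible: dialing approval requirements up or down moves the score, and you can track it per use case over time.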
🎯 “How do we pick the RIGHT AI agent use case?”
This is the question I hear most from customers exploring agentic AI.
Here’s the mechanism I run through together with the customer:
The 4-Quadrant Evaluation
When a customer brings me 5-10 agent ideas, we structure each one across four dimensions:
Business Value & Strategic Fit:
- What pain does it solve? For whom? How often?
- Can we quantify the impact? (Revenue, cost, time, quality)
- Which KPI moves if this works for 6 months?
Passing on control to your AI coding agent team entirely?
Anthropic researcher Nicholas Carlini conducted a stress test of their Claude Opus 4.6 model by deploying 16 parallel AI agents to build a complete C compiler in Rust from scratch (https://lnkd.in/eGMp4b2K). Over approximately two weeks and nearly 2,000 Claude Code sessions, the agents autonomously produced a 100,000-line compiler capable of compiling the Linux 6.9 kernel across multiple architectures (x86, ARM, and RISC-V). The experiment cost around $20,000 in API fees and demonstrated that coordinated AI agent teams can tackle complex systems programming challenges traditionally requiring significant human expertise and architectural oversight.