Thoughts on AI agents
On the engineering practices that help humans and robots work together, and my current agentic workflow.
14 March 2026 – Goulven CLEC'H
My job has changed
Few fields have been as transformed by artificial intelligence (AI) as software development. The hype and the potential are everywhere, yet concrete use cases remain to be built, and adoption still feels slow.
But this picture looks quite different in a sector obsessed with disruption, with engineers open to new tools, and where code offers a direct interface with those Large Language Models (LLMs). In early 2022, months before ChatGPT launched publicly, GitHub Copilot had already appeared in my editor. And over the following two years, AI became a peripheral but daily tool: for writing tasks (GPT-4 early 2023, then Claude 3.5 Sonnet), code review (since GPT-4o in early 2024), and deep search (o1-preview in late 2024). I also started integrating it into products via the OpenAI API, for instance to generate property valuations at Enchères Immo (GPT-4o mini).
2025 is when things really accelerated. AI agents — capable of executing commands and reacting to results — broke free from the chat paradigm: no more copy-pasting suggestions, fewer hallucinations thanks to test runs, and much shorter iteration loops. At the end of last year, agents even gained the ability to launch other agents (called subagents), which opened up a whole new world of possibilities for structured workflows and separation of concerns.
The shift has been striking. Early 2025, under 10% of my code was machine-generated. By year’s end, around 50%. Today, in early 2026, roughly 90% is generated by agents — most of it without any manual edits. A pattern I see in many colleagues, and spreading across the industry.
My agentic workflow
This rise of agents in my daily work came with plenty of trial and error, constantly challenged by new models, new tools, and the blog posts I stumble upon.
After my first experiments in Visual Studio Code, I bounced between Codex, Cursor, and Claude Code, before eventually coming back. My favourite editor had caught up in features, and nothing beats working in my IDE with all my extensions, the ability to switch model providers (like Cursor), and separate profiles (e.g. work vs personal config and subscriptions).
Like many, I also went through a heavy « prompt engineering » phase… crafting the perfect prompt, maintaining a prompt library, trying role-play (« you are an experienced senior developer, you know the project inside out, you write clean code »), etc. But today, my workflow is more focused on structuring the working environment and the agents themselves, to the point where my typical prompt looks like:
Implement this feature https://github.com/bruits/project/issues/123
Because, as models improve, working with AI feels less like finding the right magic formula, and more like structuring an environment — documentation, tools, and processes — that lets it work effectively.
The most critical factor: context management. Even beyond 100k tokens, models lose precision when overloaded,1 and the recent ones tend to explore codebases more, use more tools, and do more introspective work. Extremely productive if your project is well-structured, but a fast path to noise otherwise…
AGENTS.md and skills
AGENTS.md (and its variants CLAUDE.md, GEMINI.md, etc.) is the central guide for agents, but a poorly written one can reduce effectiveness2 and clog the context from the very first interaction. Among the mistakes I’ve observed: stuffing the file with language/framework best practices,3 formatting/linting rules,4 or duplicated documentation.5
What seems to work instead is a concise file:6 a brief overview, a few key commands,7 links to documentation, and universally applicable guardrails.8 A command hint (e.g. pnpm instead of npm) is something the agent would likely figure out on its own, but it saves tokens and time. Keep in mind that agents may also ignore AGENTS.md if they deem the instructions irrelevant or the file too long. Claude, for example, has in its system prompt: « important: this context may or may not be relevant to your tasks. You should not respond to this context unless it is highly relevant to your task. »
Paired with a solid documentation SSoT and tests covering existing behaviour (both discussed below), this creates « progressive disclosure »,9 enabling agents to incrementally discover relevant context through exploration.101112
A concrete example from my project Sampo:
# Agents Guide
Sampo is a Rust monorepo to automate changelogs, versioning, and publishing—even for monorepos across multiple package registries 🧭
## Useful Commands
```sh
cargo fmt --all                  # format
cargo clippy --all --all-targets # lint
cargo test --all                 # test
```
## Useful Resources
- In [CONTRIBUTING.md](./CONTRIBUTING.md): [Quality Guidelines](./CONTRIBUTING.md#quality-guidelines) apply to agents and humans equally, [Getting Started](./CONTRIBUTING.md#getting-started) helps you understand the project structure, and [Philosophy](./CONTRIBUTING.md#philosophy) is the project’s north star.
- The [README](./README.md) lists all crates, and per-crate READMEs (e.g. [sampo-core](./crates/sampo-core/README.md)) contain public API documentation; they should stay concise and user-facing.
- [GitHub](https://github.com/bruits/sampo) Issues and PRs are the best place for implementation details, design discussions, and technical decisions.
## Agent Guardrails
- Do not create new documentation files to explain implementation.
- Do not add external dependencies without justification. Prefer the standard library and existing utilities.
- All code, comments, documentation, commit messages, and user-facing output must be in English.
- New features or bug fixes should have a changeset generated by Sampo, see [CONTRIBUTING.md](./CONTRIBUTING.md#writing-changesets) for guidelines.

Skills are a complementary tool for documenting specific capabilities: a recurring task, a codebase quirk, or an agent-specific instruction. They can be triggered by the user (slash command) or automatically picked up by the agent when relevant.
When concise and actionable, they extend progressive disclosure nicely. But I also see plenty of questionable practices: downloading hundreds of framework-specific skills, duplicating documentation, or relying on them as safety guardrails (more on that later). In general, keep in mind that skill invocation is less reliable than AGENTS.md,13 and repeated calls burn tokens on irrelevant tasks, clogging the context.
As a concrete example, here is a skill for generating a changeset with Sampo. It links to the documentation and provides the non-interactive command (bypassing CLI prompts). Without the skill, given that Sampo is a niche tool, an agent would need trial and error to discover this command, wasting tokens:
---
name: sampo-changeset
description: Create or update changesets to describe public API changes, and trigger changelog generation and release planning.
---
[Sampo](https://github.com/bruits/sampo) is a tool to automate changelogs, versioning, and publishing. It uses changesets (markdown files describing changes explicitly) to bump versions (in SemVer format), generate changelogs (human-readable files listing changes), and publish packages (to their respective registries).
See [CONTRIBUTING.md](/CONTRIBUTING.md#writing-changesets) for changeset writing guidelines.
## Creating New Changesets
To create a changeset non-interactively:
```sh
sampo add -p <package> -b <bump> -m "<description>"
```
Where `<bump>` is `major`, `minor`, or `patch`. Use `-p` multiple times to target several packages. Prefix with the ecosystem to disambiguate: `-p cargo/my-crate`. When `changesets.tags` is configured, use `-t <tag>` to categorize the changeset.
## Updating Existing Changesets
Pending changesets are stored in the `.sampo/changesets` directory. You can edit these markdown files directly, as long as you follow the guidelines above and the Sampo format (read `.sampo/changeset.md.example` for reference).

In my opinion, most overprompting boils down to poor context management and a false assumption that models need hand-holding to be effective. In reality, they adapt well to a structured environment, and their ability to deliver quality code leaps with every new release. No magic prompt will give you access to Claude 8 or Codex 6… sometimes quite the opposite.
Supervisors and subagents
So how do I get the most out of those current models?
The core idea of my workflow is a coordination agent (Supervisor) that orchestrates a structured cycle of specialized subagents, each confined to a specific role and invoked iteratively as needed.
One or more Analysts read the issue,14 explore the codebase, and discover technical context to produce an actionable brief, without ever modifying anything. The Builder, the only agent allowed to write code, implements it following project conventions. The Reviewer then evaluates the diff and raises critiques. Finally, the Fixer uses every tool at its disposal (tests, debugging, codebase search, etc.) to decide: fix needed (→ back to Builder), invalid (→ ignored), or ambiguous (→ question to the user).
Three benefits: targeted context (each agent only sees its relevant brief), tool restriction (the Reviewer can’t edit files, the Builder can’t touch GitHub), and an iterative validation loop (Reviewer → Fixer(s) → Builder) avoiding the classic « one-shot » agents that implement something and consider the work done without verification.
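The cycle above can be sketched as a plain loop. Everything below is a mock: each subagent is stood in for by a shell function that just echoes a string (real subagents are LLM invocations with their own isolated context and tools), but the control flow matches the Reviewer → Fixer → Builder loop:

```shell
#!/bin/sh
# Mock sketch of the Supervisor cycle; each function stands in for a
# subagent invocation (real ones would be LLM calls with isolated context).
analyst()  { echo "brief: issue context and relevant files"; }
builder()  { echo "diff for: $1"; }
reviewer() { echo "critique of: $1"; }
fixer()    { echo "fix"; }  # verdict: fix | invalid | ambiguous

brief=$(analyst)
diff=$(builder "$brief")
rounds=0
verdict="fix"
# Iterate Reviewer -> Fixer -> Builder until no fix is needed (capped here,
# because this mock Fixer always answers "fix").
while [ "$verdict" = "fix" ] && [ "$rounds" -lt 3 ]; do
  critique=$(reviewer "$diff")
  verdict=$(fixer "$critique")
  if [ "$verdict" = "fix" ]; then diff=$(builder "$critique"); fi
  rounds=$((rounds + 1))
done
echo "validation loop finished after $rounds round(s), last verdict: $verdict"
```

In a real setup, the Supervisor is itself an agent and each function call is a subagent spawn; the cap on rounds is a cheap safeguard against endless review loops.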
Here is a concrete example of a subagent prompt, for the Fixer. I’ve tried to keep it simple, both to limit context window noise as mentioned above, and to make it easy to change over the course of experiments:
---
name: Fixer
model: Claude Opus 4.6 (copilot)
description: "Validates code review feedback by analyzing whether reported issues exist, then provides an honest verdict and minimal fix recommendations."
tools: ["vscode", "execute", "read", "edit", "search", "web", "github/*", "agent", "todo"]
---
This agent validates a single code review critique, without making any code changes. It analyzes carefully whether the reported issue actually exists, and provides an honest verdict with minimal fix recommendations if needed.
## Capabilities
- Follow logic end-to-end, check assumptions and edge cases
- Run tests and debugging to confirm or refute the reported issue
- Check whether the critique falls within scope of a GitHub issue (if provided)
- Compare with coding standards stated in AGENTS.md and CONTRIBUTING.md
## Outputs
- **Verdict**: Is the critique valid (fully/partly/not), and does it require a fix?
- If needed, smallest safe fix recommendation and any open questions
## Safety Rules
- **Explicit order required**: Never push commits, open PRs, or create/modify issues.
- **Production forbidden**: Never create, modify, or delete anything in production environments.

The safety rules are mainly there to prevent the agent from repeatedly attempting an undesired action. As previously mentioned, you shouldn’t rely on prompts to block high-risk actions (I promise we’ll get to that).
MCP servers and other tools
The isolated context of subagents also allows equipping them with powerful but potentially token-heavy tools, such as MCP servers. These are standardised services that expose external tools (APIs, databases, files…) to an agent, using the Model Context Protocol (MCP).
This protocol guides the LLM through tool descriptions and responses formatted specifically for the agent, whereas a CLI is usually not optimised for a model, sometimes causing laborious trial and error.15 It also standardises authentication via OAuth, avoiding the ad hoc solutions of each CLI, whose permissions can be less granular.
Several of my subagents (particularly the Analyst and the Fixer) can be called by the Supervisor to retrieve issue or project context (via GitHub, GitLab, or Linear MCP servers), the documentation SSoT (via a Notion MCP), or — better yet — investigate performance and reliability issues (via Sentry, Honeycomb, or Datadog MCP servers) or even query production data (via a Snowflake MCP server). The subagent returns only the relevant information, and the Supervisor injects it into the Builder’s context, with fairly impressive results.
More generally, useful tools are those that extract actionable signal, while limiting noise and round-trips. For instance, I increasingly constrain my subagents to use ast-grep, a code search and transformation tool that operates on syntactic structure, rather than text-only searches like rg or grep. Not only is it faster, more reliable, and more precise, but it can also perform complex code transformations in a single round-trip, instead of laborious trial and error with approximate regexes.
Another category falls under general development comfort: modern linters, formatters, and test runners, along with custom commands (e.g. running only the unit tests of a single package). Just as an interminable test suite won’t be used by human contributors, fast and reliable tools will not only be used more by agents, but will also drastically reduce iteration time.
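As a sketch of such a custom command (the package name and the underlying workspace commands are assumptions, not Sampo’s actual scripts), a tiny wrapper that scopes the test run to one package might look like:

```shell
#!/bin/sh
# Hypothetical helper: run only one package's tests for fast feedback.
# The real commands are left commented so the sketch stays runnable anywhere.
test_one() {
  pkg="$1"
  echo "running tests for $pkg only"
  # cargo test -p "$pkg"          # Rust workspace (as in Sampo)
  # pnpm --filter "$pkg" test     # pnpm workspace equivalent
}

test_one sampo-core
```

The point is less the wrapper itself than giving agents a short, documented command whose scope is obvious, so they reach for it instead of the full suite.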
Hard guardrails
Soft guardrails come quite naturally to most agent users: add limits in the prompt, a few rules in the AGENTS.md, and manually review the output… But now we need passive, deterministic guardrails that don’t depend on the agent’s discipline and that can be applied to any agent, even ones we didn’t explicitly prepare for.
Of course, some automated guardrails already in place work for agents too: CI pipelines, branch protection rules, code owners, pre-commit hooks, etc. But we can also add new ones specifically designed for agents, like this preToolUse hook from Matt Pocock to block dangerous Git commands, returning a clear error message to the agent:
#!/bin/bash
INPUT=$(cat)
COMMAND=$(echo "$INPUT" | jq -r '.tool_input.command')
DANGEROUS_PATTERNS=(
  "git push"
  "git reset --hard"
  "git clean -fd"
  "git clean -f"
  "git branch -D"
  "git checkout \."
  "git restore \."
  "push --force"
  "reset --hard"
)
for pattern in "${DANGEROUS_PATTERNS[@]}"; do
  if echo "$COMMAND" | grep -qE "$pattern"; then
    echo "BLOCKED: '$COMMAND' matches dangerous pattern '$pattern'. The user has prevented you from doing this." >&2
    exit 2
  fi
done
exit 0

Similarly, tools such as MCP servers introduce new attack surfaces.16 It is therefore important to choose well-established tools, configure their permissions granularly (especially for write actions), and restrict each subagent to only the tools it actually needs. For instance, the Reviewer has no reason to access GitHub, and if the Analyst needs to run Snowflake queries, those should be limited to read-only.
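A nice property of such a hook is that it is plain shell, so the pattern list can be sanity-checked outside the editor. A condensed version of the same grep-based matching loop, fed with sample commands (only a subset of the patterns, for illustration):

```shell
#!/bin/sh
# Feed sample commands through the same grep-based matching the hook uses,
# and print which ones would be blocked (subset of the patterns, for brevity).
for cmd in "git push origin main" "git status" "git reset --hard HEAD~1"; do
  matched=""
  for pattern in "git push" "git reset --hard" "git clean -f" "push --force"; do
    if echo "$cmd" | grep -qE "$pattern"; then
      matched="$pattern"
      break
    fi
  done
  if [ -n "$matched" ]; then
    echo "BLOCKED: $cmd (pattern: $matched)"
  else
    echo "allowed: $cmd"
  fi
done
```

Running it prints `BLOCKED` for the push and hard reset, and `allowed` for `git status`; a useful check before trusting the hook with a live agent.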
Two types of guardrails I haven’t yet tried (but find promising) are formal verification of LLM-generated code, and static analysis of agent audit trails. The first is a very active research area: LLMs could make formal methods far more accessible,17 and proof assistants could offer a verification signal strictly more reliable than tests.18 The second leverages either the actions already logged by agents combined with hooks,19 or tools like LangFuse and LangSmith to analyse action, tool, and conversation logs. Claude Code, for instance, stores session logs under ~/.claude/projects/, and the preToolUse hook can produce custom logs. These can then be analysed to detect undesirable behaviour patterns, such as repeated attempts to use a forbidden tool, or drift in the types of commands being used.
Engineering in the agent era
In my first five years as a software developer, I quickly identified the « one who speaks code » profile. Not particularly interested in the product, business, or architecture… but able to read the codebase, recall its structure (not documented), its legacy quirks (not tested), and wire together APIs into whatever management requested.
But as AI grows capable of reading codebases, explaining them, writing documentation, taking natural-language instructions, implementing features, debugging, and writing tests to validate its own work… what is left for developers whose value lies precisely in deciphering those mystical lines?
In the coming years, what I described in my previous article as ways to stand out may simply become the new baseline: challenging business requirements with technical insights, navigating company politics in search of workable compromises, defining and enforcing conventions for code, documentation, tests, infrastructure, etc.
The good news is that, in the meantime, simple engineering practices can still boost our impact and value, and the effectiveness of the LLM tools we already use.
Textual Single Source of Truth
One of the single best things you can do to help agents and humans work together is to maintain a clear, concise, and up-to-date textual documentation as the single source of truth (SSoT) for your project, with each section easily accessible and maintained by a clearly identified owner.
There are already good articles out there about the power of SSoTs for human contributors. But for agents, this source should ideally be plain-text files (Markdown, ADRs) living alongside the code, or a dedicated service reachable by agents through an MCP server (e.g. Notion). Access rights — read vs write — should be granular, so that agents, developers, product/doc owners, and designers each see what they need.
One key point is to avoid duplication. Agents, tech contributors, and non-tech stakeholders should all refer to the same source of truth, without creating multiple conflicting versions. Keeping things concise, avoiding implementation details, and including as little perishable information as possible also helps keep the documentation up-to-date and relevant for everyone.20
Among tools I’m currently underutilizing: SpecKit is starting to bridge the gap between living specs and generated code, making it easier to keep both in sync. And infrastructure as code (e.g. Terraform) is another good example of living documentation: easily accessible and modifiable by agents, with a direct impact on production.
Tests and observability
Still in the same logic of structuring the environment, tests and observability are among the most valuable guardrails.
This won’t surprise TDD advocates, but agents are proving the point: tests are not a « nice to have » that catches the occasional bug, they are living documentation, a « must have » to protect against regressions and ship with confidence.21 As software grows, no context window or human memory can hold every expected behaviour, every edge case, and every business quirk, without tests as the single source of truth. They also enforce progressive disclosure: an agent working on a well-tested codebase will bump into failing tests, and discover all the relevant context to fix them.
Observability completes the picture. Service Level Objectives (SLOs) for performance and reliability give a clear, actionable signal, far more useful than the noisy alerts everyone learns to ignore. Deeper down, structured logs and metrics provide the data to diagnose issues and confirm that fixes work. And as mentioned earlier, agents can pull all of this through MCP servers connected to your monitoring stack.
Above all, both act as passive guardrails, independent of the contributor’s discipline. Even if an agent didn’t run the test suite, the CI pipeline will catch the regression and block the merge. Even if a developer forgets to watch the dashboards, alert notifications will still fire. When more and more code is generated autonomously, these automated safety nets matter far more than any prompt.
Accountability
These safety nets benefit human contributors, but they become even more critical for increasingly autonomous agents, armed with more tools, yet still unreliable and prone to hallucinations. And while models improve rapidly, nothing suggests their short-lived nature will change any time soon. Agents are invoked, then abandoned. They execute, produce output, and dissolve — without awareness of the consequences of their actions. Accountability cannot be delegated to an ephemeral entity, and therefore it stays with the engineer who set the task in motion.22
This is not a philosophical footnote. Every guardrail discussed in this article (tests, observability, tool restrictions, structured reviews) is also a professional responsibility, to remain in control of what runs on your behalf,23 and to ensure the code you ship still reflects your standards.24 As our systems increasingly run on autopilot, this accountability may become one of the key reasons to keep a human in the loop.
Agentic development also introduces new risks and, with them, new responsibilities. While AI is a powerful lever for senior engineers, its benefits for juniors remain hard to observe.25 Worse, these tools can slow down their understanding and learning.26 This raises pressing questions about how to mentor and grow junior developers in a world where value lies increasingly in deep architectural and systemic understanding, rather than in the mere ability to write code.
On the good side, if agents absorb an ever-larger share of routine implementation, the time freed up doesn’t vanish, but shifts. Towards the harder conversations: challenging a vague requirement before a line is written, mentoring a junior developer on a legacy codebase, pushing back on a product decision that solves the wrong problem, or spending an afternoon with colleagues improving architecture, documentation, or test conventions, so that both agents and humans can do better work tomorrow!
Reducing the noise
We are witnessing a technical revolution. Engineers built the neural networks and training methods, yet LLMs quickly escaped our understanding. While researchers can study their inner workings to better guide them, for us developers this is above all a time of experimentation, trial and error, and discovery.
The surprising finding is how familiar these agents turn out to be. Far from requiring magic prompts and cryptic tooling, they ultimately reinforce good engineering practices we already know. A poorly documented, under-tested codebase with no clear conventions was already a problem for human contributors; agents just make that debt more costly and more visible.
And while models will keep improving, what remains truly valuable in a human developer is not technical but systemic and political: challenging requirements, mentoring juniors, defending architecture decisions, being accountable. These are exciting challenges for software engineers, especially those who believe their mission goes beyond writing code.
Today, working effectively with agents is mostly about managing their context and reducing the noise. That is good advice for humans too, in a period where social media is flooded by actors — whether AI-fanatics or AI-doomers — generating noise to attract attention, sell courses, build audiences, or push agendas. We could do a better job surfacing concrete experiences, individual workflows, scientific papers, and honest retrospectives.
Of course, the concerns are real: security risks, the uncertain place of juniors in this new landscape, etc. And as a citizen, I worry about the impact on personal data, disinformation, electoral interference, and new technological monopolies.27
But for engineers — especially those who call themselves crafters or builders — this is a thrilling and tremendously fun moment. New tools, new ways of working, new ways of collaborating with technology. Each new model generation or tool shakes up workflows and lets us prototype, experiment, and refactor even more freely. And with a well-structured repo, maybe even maintaining high code quality while doing so!
I look forward to seeing how things evolve… and coming back to this article to laugh at my obvious mistakes and my lousy predictions!