Executive Summary

90% of developers switch the AI model when their agent makes mistakes. That is the wrong strategy.

The AI agent is not the model. The agent is the harness - the software infrastructure that makes the model productive. This harness consists of instructions, tools, and user messages. It determines whether your agent delivers consistent results or fails systematically.

In this article you will learn:

  • What an agent harness is and why the model accounts for only 10%
  • The 3 core components that define every agent
  • Why 90% of teams fail (and how you can do better)
  • 4 concrete steps for a production-ready agent harness

Based on experience from training 400+ software developers and 25 years of IT/AI consulting.

Ready for consistent agents?

Option A: If you are at the beginning of your AI dev journey, take a look at our 12-week DEV AI Bootcamp and build real AI-first habits.

♦

Option B: If you are already good with AI and need to go faster, take a look at our Agentic Coding Hackathon.


What does a poorly configured AI agent really cost companies?

Last week at the DEV AI Bootcamp: a team of software developers told me about their "AI agent disaster".

They had invested six months:

  • Tested 3 different models (Claude, GPT-4o, Gemini)
  • Wrote 50+ custom prompts
  • Built an "AI-first" team
  • Carried out regular prompt optimizations

The result? Chaos.

The agent wrote code that was sometimes brilliant, sometimes completely off the mark. Sometimes it followed the conventions, sometimes it ignored them completely. Sometimes tests were included, sometimes not. The code review process became a rollercoaster.

The CTO told me: "We thought the model was the problem. So we switched. Three times. Nothing changed."

The real problem: The missing agent harness.

They had no system that consistently controlled the agent. No central Agent.md with clear rules. No standardized tools. No structured workflows.

They were like a cab driver without a steering wheel - the engine was powerful, but there was no control.

The model was interchangeable. The missing harness was not.

This problem is costing the industry millions. According to a 2024 McKinsey study, 70% of AI implementations fail because of missing integration and processes - not because of the technology itself.

And it is 100% avoidable.


Why are agent harnesses indispensable in 2025?

AI agents are no longer an experiment - they are a production standard.

According to the Stack Overflow Developer Survey 2024, 76% of developers already use AI tools in their daily work or are planning to do so. The GitHub Octoverse Report 2024 shows: projects with GitHub Copilot have 55% more pull requests per developer.

But here's the uncomfortable truth: most teams treat AI agents like advanced autocomplete tools. They prompt, hope, iterate. Without a system. Without a strategy.

That works for prototypes. Not for production.

In production you need:

  • Consistency: The agent writes code with the same style patterns, security practices, and best practices
  • Repeatability: The same input should (almost always) produce the same output
  • Scalability: 10 agents, 100 features, 1000 commits per month - all of this must be manageable

This is exactly where the agent harness comes into play.

Agent harness (definition):

The software infrastructure that turns an AI model into a productive agent. It comprises instructions (rules), tools (capabilities) and user messages (control).

The term comes from AI research. A Β«harnessΒ» is the infrastructure that makes a model productive. In self-driving cars, it is the sensor fusion, the safety layer, the decision framework. For software agents, it is the combination of instructions, tools and workflows.

In 25 years of IT consulting, I have seen many trends come and go. Agent harnesses are not a trend. They are the new foundation of modern software development.


What is an agent harness and how does it work?

An agent harness is the software architecture that turns an AI model into a productive agent. It consists of three inextricably linked components:

| Component | Function | Example |
|---|---|---|
| Instructions | Project-specific rules and guidelines | Agent.md with tech stack, code style, dos/don'ts |
| Tools | Available capabilities and integrations | GitHub, terminal, code search via MCP server |
| User Messages | The way you control the agent | Precise prompts with specific requirements |

Note: The model only accounts for 10%. The harness determines the other 90%.
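
To make these three components concrete, here is a minimal TypeScript sketch of how a harness could assemble a single model request: instructions loaded from the Agent.md, a tool list, and your user message. The `callModel` placeholder and the exact file path are illustrative assumptions, not a specific product API.

```typescript
import { readFileSync } from "node:fs";

// Hypothetical shape of a harness request - real agent frameworks differ in detail.
interface HarnessRequest {
  instructions: string;                           // project rules from Agent.md
  tools: { name: string; description: string }[]; // capabilities the agent may call
  userMessage: string;                            // how you steer the agent right now
}

function buildRequest(userMessage: string): HarnessRequest {
  // 1. Instructions: loaded automatically at the start of every session
  const instructions = readFileSync(".ai/rules/agent.md", "utf8");

  // 2. Tools: what the agent is allowed to do (e.g. exposed via MCP servers)
  const tools = [
    { name: "run_tests", description: "Run the project test suite (npm run test)" },
    { name: "create_pr", description: "Open a pull request on GitHub" },
  ];

  // 3. User message: the short, precise task for this cycle
  return { instructions, tools, userMessage };
}

// Usage: the same harness works regardless of which model sits behind callModel().
const request = buildRequest(
  "Implement POST /login following the pattern in src/services/auth.ts."
);
// await callModel(request); // callModel is a placeholder for your model client
```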


What role do instructions play for consistent agent results?

Instructions are the operating system of your agent. They determine WHAT the agent should do - not generic prompt guidelines, but specific, measurable rules for YOUR project.

What belongs in a production-ready Agent.md:

# Agent.md: Project Auth-Service

## Tech Stack
- Language: TypeScript 5.x, strict mode
- Testing: Vitest, coverage target >80%
- Framework: Express.js 5.x

## Code style rules
- Use ES Modules (import/export)
- See src/auth/login.ts as a template for error handling
- Component structure: utils/ for generic functions, services/ for business logic

## Workflows
1. **Feature**: Write specs → Tests → Implementation → Review
2. **Bugfix**: Root cause analysis → Minimal fix → Tests → Review
3. **Refactor**: No functional change, tests stay green

## Dos
✓ Write tests BEFORE implementation
✓ Use the example files as reference
✓ Run typecheck and linting after every change

## Don'ts
✗ Do not use the any type
✗ No console.log in production code
✗ No breaking API changes without discussion

Why does it work?

Instead of vague instructions ("write good code"), you give concrete, measurable rules. The agent can reference these rules during each session. This makes its behavior predictable.

The most important thing: The Agent.md is loaded automatically with every session. The agent knows the rules without you having to explain them each time.


What tools does an AI agent need for maximum productivity?

Tools determine HOW the agent can work. Without the right tools, the agent cannot act: an agent without GitHub integration cannot create PRs, and an agent without a terminal cannot run tests.

Examples of tools (MCP server):

GitHub Integration:
  - Read file from repo
  - Create/Update pull requests
  - Check CI/CD status

Database Access:
  - Query database schema
  - Execute migrations
  - Check data models

Terminal:
  - Run tests (npm run test)
  - Linting (npm run lint)
  - TypeCheck (npm run typecheck)

Code Search:
  - Find similar patterns in codebase
  - Search for function definitions

MCP server (Model Context Protocol): An open standard from Anthropic that enables AI models to interact with external tools and data sources in a structured way. Find out more at modelcontextprotocol.io
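
As a rough sketch of what such a tool integration can look like, the snippet below uses the MCP TypeScript SDK (@modelcontextprotocol/sdk) to expose a single terminal capability ("run_tests") over stdio. It is a minimal example under the assumption of a Node.js/ESM project; exact method names can differ between SDK versions.

```typescript
import { execSync } from "node:child_process";
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

// A minimal MCP server that gives the agent one terminal capability.
const server = new McpServer({ name: "project-tools", version: "0.1.0" });

server.tool(
  "run_tests",
  { filter: z.string().optional() }, // optional test-name filter
  async ({ filter }) => {
    const cmd = filter ? `npm run test -- ${filter}` : "npm run test";
    const output = execSync(cmd, { encoding: "utf8" });
    return { content: [{ type: "text" as const, text: output }] };
  }
);

// Talk to the agent host (IDE, CLI, etc.) over stdio.
const transport = new StdioServerTransport();
await server.connect(transport);
```

Once this server is registered in your agent host's MCP configuration, the agent can run the test suite on its own instead of asking you to do it.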

Tools make the agent independent and productive.

The best combination: Agent.md (What) + MCP tools (How) + Your prompts (Why).


How do I formulate prompts that deliver consistent results?

Instructions and tools are static. User messages are the daily interface to your agent. The way you prompt determines success or failure.

Comparison: Vague vs. precise prompt

❌ Vague: "Implement a login feature"

✅ Precise: "Implement backend login according to the pattern in src/services/auth.ts. POST /login with email + password. Return a JWT token (15 min validity). Tests with Vitest, >80% coverage. See tests/auth.test.ts for the test pattern"

The precise prompt delivers 10x better results because it fully specifies the requirement.

Prompt engineering with Agent.md:

With a production-ready Agent.md you need fewer prompt details. The Agent.md provides the context.

💭 Old model (without Agent.md):
   Prompt: 300 words + explain all conventions

👍 New model (with Agent.md):
   Prompt: 50 words + Agent.md has the rest

Why do 90% of all agent harness implementations fail?

If you're thinking Β«That sounds easy, why don't all teams do it?Β» - here are the most common mistakes:

Mistake 1: Agent.md is too generic

Problem:

❌ Agent.md with 500 lines of copy-paste from other projects
   "Code style should be good, tests are important, DRY principle..."

Solution:

✅ Agent.md with 50 lines of concrete project rules
   Tech stack: TypeScript 5.x strict, Vitest
   Template: See components/Button.tsx for style
   Tests: Pattern from __tests__/button.test.ts, >80% coverage

Concrete beats generic by 100:1.

Mistake 2: No tool integration

Problem:
The agent has access to files, but no GitHub integration. Result: the agent can write code, but cannot push. You have to push manually.

Solution:
Set up MCP servers for GitHub, terminal, and code search. The agent becomes 10x more productive.

Mistake 3: Instructions are constantly changing

Problem:
You tell the agent one rule on Monday and another on Wednesday. Agent gets confused. No consistent behavior.

Solution:
Agent.md is the Single Source of Truth. Changes go into the Agent.md, not in the prompt.

Mistake 4: Too many prompts per session

Problem:
The feature is supposed to be built in 1 prompt, but it takes 10 iterations. Agent and human both lose the context.

Solution:
Structure the work in cycles:

  1. Write specs (Prompt 1)
  2. Tests (Prompt 2)
  3. Implementation (Prompt 3)
  4. Code review (Prompt 4)

Short, focused prompts with clear output.
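
One lightweight way to keep these cycles consistent is a small set of prompt templates, one per phase. The sketch below is a hypothetical TypeScript helper; the template wording is only an example and would be adapted to your own Agent.md and test patterns.

```typescript
// Hypothetical prompt templates for the four-step cycle: Specs → Tests → Implementation → Review.
type Phase = "specs" | "tests" | "implementation" | "review";

const templates: Record<Phase, (feature: string) => string> = {
  specs: (f) =>
    `Write a short spec for "${f}". List acceptance criteria only, no code yet.`,
  tests: (f) =>
    `Write Vitest tests for "${f}" based on the agreed spec. Follow tests/auth.test.ts as the pattern.`,
  implementation: (f) =>
    `Implement "${f}" with the minimal code needed to make the tests green. Follow Agent.md.`,
  review: (f) =>
    `Self-review the implementation of "${f}": check the Dos/Don'ts in Agent.md, then run lint and typecheck.`,
};

// Usage: one focused prompt per step instead of one 300-word mega-prompt.
console.log(templates.tests("backend login (POST /login)"));
```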

Ready for consistent agents?

Option A: If you are at the beginning of your AI dev journey, take a look at our 12-week DEV AI Bootcamp and build real AI-first habits.

♦

Option B: If you are already good with AI and need to go faster, take a look at our Agentic Coding Hackathon.


What does a successful agent harness look like in practice?

A team from the DEV AI Bootcamp came with the following problem:

Before (without production-ready harness):

  • 3 weeks per feature (with AI agent)
  • 40% of the PRs were rejected (code quality issues)
  • Agent made the same mistakes over and over again
  • Each prompt required 50+ words of instruction

After a 1-day bootcamp (harness workshop):

We built together:

  1. Agent.md with 12 clear rules for their project
  2. MCP servers for GitHub + database integration
  3. Prompt templates for Feature/Bugfix/Refactor

After 2 weeks with optimized harness:

| Metric | Before | After | Improvement |
|---|---|---|---|
| Feature development time | 3 weeks | 3-4 days | 6x faster |
| PR rejection rate | 40% | 8% | -80% |
| Repeated errors | Frequent | 0 | -100% |
| Prompt length | 100 words | 10 words | -90% |

The CTO (anonymized):

"We spent 6 months tweaking prompts. The Agent.md file did more in one day than all the prompt optimizations put together."

Business impact:

  • Code review time: -40%
  • Bugs in Production: -75%
  • Agent independence: +90%

How do you build a production-ready agent harness in 4 steps?

Step 1: Standardize your instructions (30 minutes)

Action:

  1. Create .ai/rules/agent.md in the project root
  2. Document:
  • Tech stack (language, versions, important libs)
  • Commands (Build, Test, Lint, Typecheck)
  • Code style (with a reference file: "See components/Button.tsx")
  • 3-5 Dos
  • 3-5 Don'ts

Example structure:

# Agent.md: Auth-Service

## Tech Stack
- TypeScript 5.x (strict mode)
- Express.js 5.x
- PostgreSQL 15

## Commands
npm run test # Run tests
npm run typecheck # Type-Check
npm run lint # ESLint

## Code Style
See src/auth/login.ts as a template.
Always use typed errors, never the any type.

## Dos
✓ Tests BEFORE implementation
✓ Think about error cases

## Don'ts
✗ No any types
✗ No console.log in production code

Success check: The agent should format code correctly without further prompting.


Step 2: Integrate your tools (45 minutes)

Action:

  1. List the available tools
  2. Make sure the agent has access to them
  3. Test each tool with a simple example

Typical tools:

  • GitHub (Create PR, Push Code)
  • Terminal (Run Tests, Lint)
  • Code-Search (Find Patterns)
  • Database (Check Schema)

Test:

"Agent: Create a pull request for the new feature"

If the agent can do this, tools are configured.
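
If you prefer an automated check over a manual prompt, a small smoke-test script can verify that the commands documented in your Agent.md actually run. This is a hypothetical helper; the command list is taken from the example Agent.md above.

```typescript
import { execSync } from "node:child_process";

// Smoke test: every command the Agent.md promises must actually work,
// otherwise the agent fails the moment it tries to use its tools.
const commands = ["npm run test", "npm run lint", "npm run typecheck"];

for (const cmd of commands) {
  try {
    execSync(cmd, { stdio: "pipe" });
    console.log(`OK   ${cmd}`);
  } catch {
    console.error(`FAIL ${cmd} - fix this before handing the tool to the agent`);
    process.exitCode = 1;
  }
}
```

Run it once after setup and again whenever the Agent.md command list changes.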


Step 3: Define standard workflows (60 minutes)

Action:
Document in the Agent.md how the agent should work.

## Workflow: Feature Implementation

1. **Specs**: Make the requirements clear, no ambiguities
2. **Tests**: Tests first, acceptance criteria as tests
3. **Implementation**: Minimal code until the tests are green
4. **Refactor**: Improve code quality, no functional change
5. **Review**: Self-review against best practices

## Workflow: Bugfix
1. **Root cause analysis** (do not just patch symptoms)
2. **Minimal fix** (no overengineering)
3. **Tests for the bug** (so it does not happen again)
4. **Verification**: the bug is gone

Success check: The next time you have the agent build a feature, it should automatically follow this workflow.


Step 4: Iterative improvement (continuous)

Action:
Ask after each agent session:

  • What rules did the agent ignore?
  • What mistakes does it make regularly?
  • What could be clearer?

→ Update the Agent.md.

Example:

  • Agent writes code without tests? → Add a rule to Agent.md: "Always write tests first"
  • Agent does not follow the code style? → Reference a more specific template file.

Pro Tip: An error is a feedback signal. Use it.


The 5 most common agent harness errors (and their solutions)

| Problem | Cause | Solution |
|---|---|---|
| Agent ignores the code style | Agent.md too generic, no concrete reference file | Pin a concrete example file: "See components/Button.tsx as a template. Use exactly this structure." |
| Inconsistent test coverage | No clear TDD rule | Add a rule: "Tests BEFORE implementation, do not change tests during the green phase" |
| Agent makes the same mistake repeatedly | Errors corrected only via prompt | Document the error as a don't rule in Agent.md |
| Tools are not used | Tools not configured/tested | Set up an MCP server, test a simple tool call |
| Context explosion after 10 prompts | Too many files pinned | Use the agent for code search, pin only reference files (max. 3) |

 


Summary: Agent Harness Essentials

The most important thing:

  • The agent harness (Instructions + Tools + User Messages) is more important than the model
  • 90% of the agent problems are harness problems, not model problems
  • A good harness makes results consistent, repeatable, scalable

Can be implemented immediately:

  1. Create your first Agent.md (30 min) - Project-specific rules, not generic copy-paste
  2. Define 3-5 clear rules per category (code style, tests, workflows)
  3. Iterate based on mistakes - If a mistake happens twice, it belongs in the Agent.md

Business impact:
Teams with production-ready harnesses ship 6-20x faster and have 75% fewer bugs (based on real production metrics).

This is not hype - it is measurable and reproducible.


🚀 Learn agent harnesses in practice

Free resources

📄 Agent.md template: github.com/obviousworks/agentic-coding-rulebook
Production-ready template with everything you need!

Our training

Option A: If you are at the beginning of your AI dev journey, take a look at our 12-week DEV AI Bootcamp and build real AI-first habits.

♦

Option B: If you are already good with AI and need to go faster, take a look at our Agentic Coding Hackathon.


Stay up to date

💬 LinkedIn Community: linkedin.com/in/matthiasherbert

πŸ™ GitHub: github.com/obviousworks


Do you need support with AI transformation?

At obviousworks.ch, we offer hands-on consulting and in-depth support - from strategic assessment to successful implementation. No theory, just tried-and-tested strategies for Swiss companies.

Let's talk: https://www.obviousworks.ch/kontakt/

 

Suitable training courses

Agentic Coding Hackathon

Get up to speed in 3-5 days!

FAQs

What is the difference between Agent Harness and Prompt Engineering?

Prompt Engineering optimizes individual inputs. Agent harness is the entire infrastructure - instructions, tools, workflows. A good harness makes intensive prompt engineering superfluous.

Does an agent harness work with all AI models?

Yes, the harness is model-agnostic. Whether Claude, GPT-4o, Gemini or Llama - the same instructions and tools work. That's why the harness is more important than the model.

How long does it take to build a production-ready harness?

4-8 hours for the basic structure. Then continuous improvement. Most teams see significant improvements after just 1 week.

What is an Agent.md file?

An Agent.md is a Markdown file in the project root that documents all rules, code style specifications and workflows for AI agents. It is loaded automatically with every session.

What are MCP servers for AI agents?

MCP (Model Context Protocol) is an open standard that enables AI models to interact with external tools (GitHub, terminal, databases). MCP servers are the concrete implementations of these integrations.

Do I need programming knowledge for an agent harness?

Basic knowledge is helpful, but not a prerequisite. The Agent.md is a simple Markdown file. MCP servers can often be configured with just a few clicks.

Matthias (AI Ninja)

Matthias puts his heart, soul and mind into it. He will make you, your team and your company fit for the future with AI!

About Matthias: Trainer profile
To his LinkedIn profile