From Blueprint to Production: Building a Kubernetes MCP Server

As Large Language Models (LLMs) evolve from simple chatbots into the engines behind agentic workflows, the need for a standardized way to connect them to external data and infrastructure has become critical. In a recent workshop hosted by Nir Adler, Innovation Engineer at Komodor, we explored how to bridge this gap using the Model Context Protocol (MCP).

This post details how to build an MCP server that connects AI agents (like Claude Desktop or Cursor) to a Kubernetes cluster, enabling natural language control over kubectl operations. To learn more, watch the full workshop here.

What is the Model Context Protocol (MCP)?

MCP is an open protocol designed to standardize how AI agents communicate with external systems. Before MCP, developers had to build a bespoke integration for each AI provider (OpenAI, Anthropic, etc.). MCP solves this with a unified, JSON-RPC-based protocol that works with any agent that supports it.

An MCP server is built upon three main pillars:

  1. Resources: These act like GET requests for static context. They allow the agent to read data, such as file contents or, in our case, Kubernetes cluster contexts.
  2. Tools: These are executable functions. They allow the LLM to take action, such as running a kubectl command or decoding a secret.
  3. Prompts: These are pre-made templates for specific workflows. Instead of writing a long context every time, you can trigger a “Diagnose Cluster” prompt that already contains the necessary system instructions and variable placeholders.

The Tech Stack

For this implementation, we utilized the following stack:

  • Language: Python (using the fastmcp library/SDK).
  • Dependency Management: uv (for fast installation and management).
  • Cluster Access: A local kubectl installation configured with access to the target cluster.
  • Transport: The server supports stdio (standard input/output for local agents) and http (for remote connections).
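
As a rough sketch of that scaffolding (assuming the fastmcp Python SDK installed with uv; the server name and the exact transport arguments are illustrative and may vary between SDK versions):

```python
# server.py — minimal scaffolding; the sections below register
# resources, tools, and prompts on this `mcp` instance.
from fastmcp import FastMCP

mcp = FastMCP("k8s-mcp")  # the server name shown to connecting agents

if __name__ == "__main__":
    # stdio is the default transport: the local agent (Claude Desktop,
    # Cursor) launches the server as a subprocess and talks over stdin/stdout.
    mcp.run()
    # For remote access, an HTTP transport can be selected instead, e.g.:
    # mcp.run(transport="http", host="0.0.0.0", port=8000)
```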

Step-by-Step Implementation

1. Setting up Resources (Context)

The first step in building the server is defining Resources. In a Kubernetes context, resources are ideal for information that doesn’t change frequently. We created a resource to list the available cluster contexts locally. This allows the agent to understand “where” it is operating without needing to run active commands constantly.
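
A minimal sketch of such a resource, building on the `mcp` instance above (the `k8s://contexts` URI and the output format are our own choices, not a fixed convention):

```python
import subprocess

@mcp.resource("k8s://contexts")
def list_contexts() -> str:
    """Kubectl contexts available on this machine, and which one is active."""
    available = subprocess.run(
        ["kubectl", "config", "get-contexts", "-o", "name"],
        capture_output=True, text=True, check=True,
    ).stdout
    current = subprocess.run(
        ["kubectl", "config", "current-context"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    return f"Current context: {current}\n\nAvailable contexts:\n{available}"
```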

2. Building Tools (Actions)

Tools are the hands of the agent. We implemented a tool that wraps kubectl commands. This allows the agent to interact with the cluster directly.

A critical aspect of tool creation is Human-in-the-Loop verification. For read-only commands (like get pods), the agent can proceed freely. However, for mutation commands (like delete, apply, or create), we implemented an approval flow. The MCP server sends a request back to the client asking the user to confirm the action before execution. This prevents the LLM from accidentally deleting deployments.
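
Below is a sketch of what that flow can look like. It assumes fastmcp's elicitation support (`ctx.elicit`), whose exact call signature and result shape vary between SDK versions and which not every client implements; the read-only verb list is our own simplification.

```python
import shlex
import subprocess

from fastmcp import Context

# Verbs we treat as read-only; anything else counts as a mutation.
READ_ONLY_VERBS = {"get", "describe", "logs", "top", "explain", "api-resources", "version"}

@mcp.tool()
async def run_kubectl(command: str, ctx: Context) -> str:
    """Run a kubectl command, e.g. 'get pods -n payments'.

    Read-only verbs run immediately; mutating verbs (delete, apply,
    create, ...) are sent back to the client for human approval first.
    """
    args = shlex.split(command)
    verb = args[0] if args else ""

    if verb not in READ_ONLY_VERBS:
        # Human-in-the-loop: ask the connected client to confirm the action.
        approval = await ctx.elicit(f"Approve running `kubectl {command}`?")
        if approval.action != "accept":
            return "Command was not approved by the user; nothing was executed."

    result = subprocess.run(["kubectl", *args], capture_output=True, text=True)
    return result.stdout if result.returncode == 0 else f"kubectl failed: {result.stderr}"
```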

We also added utility tools, such as a Base64 decoder, which is essential for investigating Kubernetes Secrets, whose values are stored base64-encoded.
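
A decoder tool can stay very small, for example (again registered on the same `mcp` instance):

```python
import base64

@mcp.tool()
def decode_base64(value: str) -> str:
    """Decode a base64-encoded string, e.g. a value pulled from a Kubernetes Secret."""
    return base64.b64decode(value).decode("utf-8")
```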

3. Designing Prompts (Workflows)

Prompts help guide the LLM. We created a prompt template called diagnose_cluster. This template accepts variables (like a namespace) and injects a specific system prompt that instructs the LLM on how to investigate issues, look for events, and analyze logs. This standardization ensures consistent results without the user needing to be a prompt engineering expert.
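
A sketch of what such a prompt can look like (the wording of the instructions here is illustrative, not the exact template from the workshop):

```python
@mcp.prompt()
def diagnose_cluster(namespace: str = "default") -> str:
    """Guided workflow for investigating problems in a namespace."""
    return (
        f"You are a Kubernetes troubleshooting assistant. Investigate the "
        f"'{namespace}' namespace step by step:\n"
        "1. List the pods and identify any that are not Running and Ready.\n"
        "2. Check recent events and the logs of the failing pods.\n"
        "3. Inspect related ConfigMaps and Secrets for misconfiguration.\n"
        "4. Summarize the root cause and propose a concrete fix."
    )
```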

Production Readiness and Best Practices

Taking an MCP server from a prototype to production requires attention to observability and reliability.

  • Monitoring with OpenTelemetry: We wrapped our tool functions with a custom decorator that sends traces to an OpenTelemetry collector (a minimal sketch follows this list). This allows us to monitor spans, catch errors, and debug issues in production.
  • Testing with MCP Inspector: Anthropic provides an “Inspector” tool that allows you to debug your server without an agent. You can test tool invocation and resource retrieval in an isolated environment.
  • Handling Hallucinations: The most effective way to reduce hallucinations is through detailed Tool Descriptions. The LLM relies entirely on the description field to understand how and when to use a tool. Iterating on these descriptions is the primary method for tuning performance.
  • LLM Sampling: This advanced feature allows the MCP server to act as a “micro-agent.” If a kubectl command fails, the server can feed the error back to the LLM and ask for a corrected command automatically, creating a self-healing loop.
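
As an illustration of the tracing-decorator idea (assuming the opentelemetry-api and opentelemetry-sdk packages; exporter configuration is omitted, the `traced` name is ours, and async tools would need an async variant of the wrapper):

```python
import functools

from opentelemetry import trace
from opentelemetry.trace import Status, StatusCode

tracer = trace.get_tracer("k8s-mcp")

def traced(func):
    """Record every invocation of a tool as an OpenTelemetry span."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with tracer.start_as_current_span(func.__name__) as span:
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                span.record_exception(exc)          # keep the failure visible in traces
                span.set_status(Status(StatusCode.ERROR))
                raise
    return wrapper

# Applied beneath the registration decorator, so the function fastmcp registers is already traced:
# @mcp.tool()
# @traced
# def decode_base64(value: str) -> str: ...
```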

Real-World Application

During the live demo, we connected this MCP server to Cursor and Claude Desktop. By simply typing “investigate issues in the namespace,” the agent used the tools to:

  1. Identify a crashing pod.
  2. Analyze the logs.
  3. Discover a misconfigured secret (base64 encoding issue).
  4. Suggest the exact code fix.
  5. Generate a React-based dashboard (Artifact) to visualize the cluster health.

Conclusion

MCP provides a powerful, standardized way to bring agentic AI to Kubernetes. By abstracting kubectl complexities behind natural language and enforcing safety checks like human approval, we can create tools that empower developers to troubleshoot and manage clusters more efficiently.

For those interested in the code, the repository is public and includes branches for each stage of the development process discussed above.