Building an AWS Solutions Architect Agent with Strands, LM Studio, and MCP Tools
Running AI agents locally has never been more accessible. In this tutorial, you’ll learn how to build a sophisticated AWS Solutions Architect agent that runs entirely on your machine using LM Studio, extended with powerful AWS documentation tools through the Model Context Protocol (MCP).
What You’ll Build
By the end of this guide, you’ll have:
- A conversational AI agent running on your local LM Studio instance
- Custom AWS architecture tools for VPC design, compute and database recommendations, and security guidance
- Integration with AWS’s official MCP server for real-time documentation access
- A complete solutions architect assistant that can answer questions about AWS services, regional availability, and best practices
Prerequisites
Before we begin, make sure you have:
- Python 3.8 or higher installed
- [LM Studio](https://lmstudio.ai/) downloaded and installed
- A local LLM model loaded in LM Studio (I recommend Llama 3 or Mistral variants)
- Basic familiarity with Python and AWS concepts
Part 1: Setting Up Your First Strands Agent with LM Studio
Install Dependencies
First, create a new project directory and install the required packages:
```bash
pip install 'strands-agents[openai]'
```
Create Your First Agent
Strands makes it incredibly simple to connect to LM Studio. Create a file called `strands_agent.py`:
```python
from strands import Agent
from strands.models.openai import OpenAIModel

# Configure an OpenAI-compatible model pointing to LM Studio
model = OpenAIModel(
    client_args={
        "api_key": "lm-studio",  # LM Studio doesn't require a real key
        "base_url": "http://localhost:1234/v1",
    },
    model_id="local-model",  # LM Studio routes requests to the currently loaded model
    params={
        "max_tokens": 1000,
        "temperature": 0.7,
    }
)

# Create a Strands agent with LM Studio as the LLM provider
agent = Agent(
    name="LMStudioChatAgent",
    system_prompt="You are a helpful AI assistant. Provide clear, concise, and accurate responses.",
    model=model
)

def main():
    print("Strands Agent with LM Studio")
    print("=" * 50)
    print("Make sure LM Studio is running on http://localhost:1234")
    print("Type 'quit' to exit")
    print("=" * 50)

    while True:
        user_input = input("\nYou: ").strip()
        if user_input.lower() == 'quit':
            print("Goodbye!")
            break
        if not user_input:
            continue

        # Run the agent with the user's message
        response = agent(user_input)
        print(f"\nAssistant: {response}")

if __name__ == "__main__":
    main()
```
Start LM Studio
- Open LM Studio
- Load your preferred model (e.g., Llama 3.1 8B)
- Navigate to the “Local Server” tab
- Click “Start Server” (it will run on http://localhost:1234 by default)
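To confirm the server is up before wiring in the agent, you can query the OpenAI-compatible models endpoint that LM Studio exposes:

```bash
# Should return a JSON list of the models LM Studio has loaded
curl http://localhost:1234/v1/models
```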
Run Your Agent
```bash
python strands_agent.py
```
You now have a working conversational agent running entirely on your local machine! The beauty of Strands is that it automatically handles conversation history, so your agent maintains context across multiple turns.
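For example, a follow-up question can lean on an earlier turn without restating it. Here's a minimal sketch reusing the `agent` defined above (assuming the default Strands conversation manager):

```python
# Each call appends to the same conversation history
agent("My VPC uses the CIDR block 10.0.0.0/16.")
response = agent("How many IP addresses does that give me?")  # "that" resolves via history
print(response)
```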
Part 2: Extending to an AWS Solutions Architect
Now let’s transform this basic agent into a specialized AWS Solutions Architect by adding custom tools.
Understanding Strands Tools
Strands uses the `@tool` decorator to expose Python functions to your agent. The agent can intelligently decide when to call these functions based on user queries.
Create AWS Architecture Tools
Create `aws_architect_agent.py` with specialized AWS tools:
```python
from strands import Agent, tool
from strands.models.openai import OpenAIModel

# Define AWS architecture tools
@tool
def design_vpc(cidr_block: str, availability_zones: int = 2) -> str:
    """Design a VPC architecture with subnets across availability zones.

    Args:
        cidr_block: The CIDR block for the VPC (e.g., "10.0.0.0/16")
        availability_zones: Number of availability zones to use (default: 2)
    """
    # Use the first two octets (e.g., "10.0") as the base for subnet CIDRs
    base = ".".join(cidr_block.split("/")[0].split(".")[:2])
    public_subnets = "\n".join(
        f"  AZ-{i + 1} Public:  {base}.{i * 16}.0/20"
        for i in range(availability_zones)
    )
    private_subnets = "\n".join(
        f"  AZ-{i + 1} Private: {base}.{i * 16 + 128}.0/20"
        for i in range(availability_zones)
    )
    return f"""
VPC Architecture Design:
- VPC CIDR: {cidr_block}
- Availability Zones: {availability_zones}
- Public Subnets: {availability_zones} (one per AZ)
- Private Subnets: {availability_zones} (one per AZ)
- NAT Gateways: {availability_zones} (one per AZ for high availability)
- Internet Gateway: 1
- Route Tables: 2 (public and private)

Recommended subnet allocation:
{public_subnets}
{private_subnets}
"""

@tool
def recommend_compute(workload_type: str, expected_traffic: str) -> str:
    """Recommend compute services based on workload characteristics.

    Args:
        workload_type: Type of workload (e.g., "web", "batch", "microservices", "ml")
        expected_traffic: Expected traffic pattern (e.g., "steady", "variable", "spiky", "unpredictable")
    """
    recommendations = {
        "web": {
            "steady": "EC2 with Auto Scaling Group + Application Load Balancer",
            "variable": "ECS Fargate with Application Load Balancer",
            "spiky": "Lambda with API Gateway",
            "unpredictable": "ECS Fargate with Auto Scaling"
        },
        "batch": {
            "steady": "EC2 Spot Instances with AWS Batch",
            "variable": "AWS Batch with Fargate",
            "spiky": "Lambda or Step Functions",
            "unpredictable": "AWS Batch with mixed instance types"
        },
        "microservices": {
            "steady": "EKS with managed node groups",
            "variable": "ECS Fargate",
            "spiky": "Lambda with API Gateway",
            "unpredictable": "EKS with Karpenter autoscaling"
        },
        "ml": {
            "steady": "SageMaker endpoints with auto-scaling",
            "variable": "SageMaker Serverless Inference",
            "spiky": "Lambda with SageMaker async inference",
            "unpredictable": "SageMaker with auto-scaling"
        }
    }
    recommendation = recommendations.get(workload_type.lower(), {}).get(
        expected_traffic.lower(),
        "Please provide a valid workload_type and expected_traffic"
    )
    return f"Recommended compute service for {workload_type} workload with {expected_traffic} traffic: {recommendation}"

@tool
def design_database(data_type: str, scale: str, consistency_requirement: str = "strong") -> str:
    """Recommend database services based on data characteristics.

    Args:
        data_type: Type of data (e.g., "relational", "document", "key-value", "graph", "timeseries")
        scale: Expected scale (e.g., "small", "medium", "large", "massive")
        consistency_requirement: Consistency requirement (e.g., "strong", "eventual")
    """
    recommendations = {
        "relational": {
            "small": "RDS (MySQL/PostgreSQL) - Single AZ",
            "medium": "RDS (MySQL/PostgreSQL) - Multi-AZ",
            "large": "Aurora PostgreSQL with read replicas",
            "massive": "Aurora PostgreSQL Global Database"
        },
        "document": {
            "small": "DocumentDB (single instance)",
            "medium": "DocumentDB (replica set)",
            "large": "DocumentDB (sharded cluster)",
            "massive": "DynamoDB with on-demand capacity"
        },
        "key-value": {
            "small": "ElastiCache Redis (single node)",
            "medium": "ElastiCache Redis (cluster mode disabled)",
            "large": "ElastiCache Redis (cluster mode enabled)",
            "massive": "DynamoDB with DAX"
        },
        "graph": {
            "small": "Neptune (single instance)",
            "medium": "Neptune (with read replicas)",
            "large": "Neptune (multi-region)",
            "massive": "Neptune (global database)"
        },
        "timeseries": {
            "small": "RDS PostgreSQL with TimescaleDB",
            "medium": "Timestream",
            "large": "Timestream with data tiering",
            "massive": "Timestream + S3 for long-term storage"
        }
    }
    db_service = recommendations.get(data_type.lower(), {}).get(scale.lower(), "Invalid data_type or scale")
    consistency_note = ""
    if consistency_requirement.lower() == "eventual" and data_type.lower() in ["document", "key-value"]:
        consistency_note = "\nNote: Consider DynamoDB for eventual consistency requirements, with global tables for multi-region."
    return f"Recommended database: {db_service}{consistency_note}"

@tool
def security_best_practices(resource_type: str) -> str:
    """Provide security best practices for AWS resources.

    Args:
        resource_type: Type of AWS resource (e.g., "s3", "ec2", "rds")
    """
    practices = {
        "s3": """
S3 Security Best Practices:
1. Enable bucket encryption (SSE-S3 or SSE-KMS)
2. Block public access unless explicitly needed
3. Enable versioning for data protection
4. Use bucket policies with least privilege
5. Enable access logging
6. Enable MFA Delete for critical buckets
7. Use VPC endpoints for private access
""",
        "ec2": """
EC2 Security Best Practices:
1. Use Systems Manager Session Manager instead of SSH
2. Keep AMIs and software up to date
3. Use security groups with least privilege (no 0.0.0.0/0 for SSH)
4. Enable detailed monitoring
5. Use IAM roles instead of access keys
6. Encrypt EBS volumes
7. Use AWS Systems Manager for patch management
""",
        "rds": """
RDS Security Best Practices:
1. Enable encryption at rest
2. Use SSL/TLS for connections
3. Place in private subnets
4. Use security groups to restrict access
5. Enable automated backups
6. Enable Multi-AZ for production
7. Use IAM database authentication
8. Enable Enhanced Monitoring
"""
    }
    return practices.get(resource_type.lower(), f"Security best practices not available for {resource_type}")

# Configure the model
model = OpenAIModel(
    client_args={
        "api_key": "lm-studio",
        "base_url": "http://localhost:1234/v1",
    },
    model_id="local-model",
    params={
        "max_tokens": 2000,
        "temperature": 0.7,
    }
)

# Create the AWS Architect agent with tools
agent = Agent(
    name="AWSArchitect",
    system_prompt="""You are an expert AWS Solutions Architect with deep knowledge of AWS services,
best practices, and cost optimization. You help design scalable, secure, and cost-effective
cloud architectures.

When designing solutions:
1. Always consider high availability and fault tolerance
2. Follow the AWS Well-Architected Framework pillars
3. Recommend appropriate services based on workload characteristics
4. Consider cost optimization opportunities
5. Emphasize security best practices
6. Think about scalability and future growth

Use your tools to provide specific recommendations and designs.""",
    model=model,
    tools=[
        design_vpc,
        recommend_compute,
        design_database,
        security_best_practices
    ]
)

def main():
    print("AWS Solutions Architect Agent")
    print("=" * 60)
    print("Ask me about AWS architecture, best practices, and design!")
    print("Type 'quit' to exit")
    print("=" * 60)

    while True:
        user_input = input("\nYou: ").strip()
        if user_input.lower() == 'quit':
            print("Goodbye!")
            break
        if not user_input:
            continue

        response = agent(user_input)
        print(f"\nAWS Architect: {response}")

if __name__ == "__main__":
    main()
```
Try It Out
Run your enhanced agent:
```bash
python aws_architect_agent.py
```
Now you can ask questions like:
- “Design a VPC with CIDR 10.0.0.0/16 across 3 availability zones”
- “What compute service should I use for a microservices workload with spiky traffic?”
- “Recommend a database for large-scale relational data”
- “What are the security best practices for S3?”
The agent will intelligently call the appropriate tools and provide detailed recommendations.
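Because `@tool` functions are still plain Python underneath, you can also sanity-check them directly before involving the model, which helps when debugging tool output formatting. A quick sketch, assuming the decorated functions remain directly callable as in current Strands releases:

```python
# Exercise the tools without going through the LLM
print(design_vpc("10.0.0.0/16", availability_zones=3))
print(recommend_compute("microservices", "spiky"))
```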
Part 3: Adding AWS MCP Tools for Real-Time Documentation
While custom tools are powerful, wouldn’t it be great to have access to AWS’s official documentation in real-time? This is where the Model Context Protocol (MCP) shines.
What is MCP?
Model Context Protocol is an open standard that allows AI agents to connect to external tool servers. Instead of hardcoding tools, you can dynamically load them from MCP servers maintained by AWS, Anthropic, and the community.
Install MCP Dependencies
First, install the required packages:
```bash
pip install mcp strands-agents
```
You'll also need uv, the Python package manager whose `uvx` command launches the MCP server:

macOS/Linux:

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

Windows:

```powershell
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
```

Or with pip:

```bash
pip install uv
```
Create the MCP-Enhanced Agent
Create `aws_architect_mcp_simple.py`:
```python
from mcp import stdio_client, StdioServerParameters
from strands import Agent
from strands.models.openai import OpenAIModel
from strands.tools.mcp import MCPClient

# Configure an OpenAI-compatible model pointing to LM Studio
model = OpenAIModel(
    client_args={
        "api_key": "lm-studio",
        "base_url": "http://localhost:1234/v1",
    },
    model_id="local-model",
    params={
        "max_tokens": 2000,
        "temperature": 0.7,
    }
)

# Create an MCP client for the AWS Knowledge server
aws_knowledge_client = MCPClient(
    lambda: stdio_client(
        StdioServerParameters(
            command="uvx",
            args=["fastmcp", "run", "https://knowledge-mcp.global.api.aws"]
        )
    ),
    prefix="aws"
)

# Create the agent with the MCP client
agent = Agent(
    name="AWSArchitect",
    system_prompt="""You are an expert AWS Solutions Architect with access to AWS documentation.
Use your tools to:
- Search AWS documentation for current information
- Check service availability in different regions
- Provide architecture recommendations based on AWS best practices
- Always cite your sources from AWS documentation""",
    model=model,
    tools=[aws_knowledge_client]
)

def main():
    print("AWS Solutions Architect Agent (with MCP)")
    print("=" * 60)
    print("Connected to AWS Knowledge MCP Server")
    print("Type 'quit' to exit")
    print("=" * 60)

    while True:
        user_input = input("\nYou: ").strip()
        if user_input.lower() == 'quit':
            print("Goodbye!")
            break
        if not user_input:
            continue

        try:
            response = agent(user_input)
            print(f"\nAWS Architect: {response}")
        except Exception as e:
            print(f"\nError: {e}")

if __name__ == "__main__":
    main()
```
What MCP Tools Are Available?
The AWS Knowledge MCP server provides several powerful tools (you can confirm the exact names with the snippet after this list):
- `search_documentation`: Search AWS documentation for specific topics
- `read_documentation`: Read specific AWS documentation pages
- `get_regional_availability`: Check if services are available in specific AWS regions
- `list_regions`: Get a list of all AWS regions
- `recommend`: Get content recommendations based on documentation pages
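To see exactly which tool names your agent will receive, you can list them through the client. A short sketch using Strands' `list_tools_sync`; note the MCP session must be open, hence the `with` block:

```python
# Print the tools the AWS Knowledge MCP server exposes
with aws_knowledge_client:
    for t in aws_knowledge_client.list_tools_sync():
        print(t.tool_name)
```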
Run Your MCP-Enhanced Agent
```bash
python aws_architect_mcp_simple.py
```
Now you can ask questions like:
- “What is AWS Lambda and when should I use it?”
- “Is Amazon ECS available in eu-west-1?”
- “What are the latest features in Amazon S3?”
- “How do I set up a highly available web application?”
The agent will search AWS’s official documentation and provide accurate, up-to-date answers with citations!
Part 4: Combining Custom Tools with MCP
For the ultimate AWS architect agent, you can combine your custom tools with MCP tools. Here’s how:
```python
from mcp import stdio_client, StdioServerParameters
from strands import Agent, tool
from strands.models.openai import OpenAIModel
from strands.tools.mcp import MCPClient

# Your custom tools
@tool
def design_vpc(cidr_block: str, availability_zones: int = 2) -> str:
    """Design a VPC architecture with subnets across availability zones."""
    # ... implementation from Part 2 ...

@tool
def recommend_compute(workload_type: str, expected_traffic: str) -> str:
    """Recommend compute services based on workload characteristics."""
    # ... implementation from Part 2 ...

# Configure the model
model = OpenAIModel(
    client_args={
        "api_key": "lm-studio",
        "base_url": "http://localhost:1234/v1",
    },
    model_id="local-model",
    params={
        "max_tokens": 2000,
        "temperature": 0.7,
    }
)

# Create the MCP client
aws_knowledge_client = MCPClient(
    lambda: stdio_client(
        StdioServerParameters(
            command="uvx",
            args=["fastmcp", "run", "https://knowledge-mcp.global.api.aws"]
        )
    ),
    prefix="aws"
)

# Combine custom tools with MCP tools
agent = Agent(
    name="AWSArchitect",
    system_prompt="""You are an expert AWS Solutions Architect with both custom design tools
and access to AWS documentation. Use your custom tools for architecture design and MCP tools
for documentation lookup.""",
    model=model,
    tools=[
        design_vpc,
        recommend_compute,
        aws_knowledge_client  # MCP tools loaded dynamically
    ]
)
```
This gives you the best of both worlds: custom logic for architecture patterns and real-time access to AWS documentation.
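If your Strands version doesn't accept an `MCPClient` instance directly in `tools`, the documented fallback is to open the session yourself and pass in the listed tools. A hedged sketch; keep the agent usage inside the `with` block so the MCP session stays alive:

```python
# Explicit session management: list MCP tools and combine with custom tools
with aws_knowledge_client:
    mcp_tools = aws_knowledge_client.list_tools_sync()
    agent = Agent(
        name="AWSArchitect",
        system_prompt="You are an expert AWS Solutions Architect.",
        model=model,
        tools=[design_vpc, recommend_compute, *mcp_tools],
    )
    print(agent("Is Amazon ECS available in eu-west-1?"))
```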
Best Practices and Tips
1. Choose the Right Model
For AWS architecture tasks, I recommend:
- Llama 3.1 8B: Great balance of speed and capability
- Mistral 7B: Excellent for technical tasks
- Qwen 2.5 14B: Superior reasoning for complex architectures
2. Optimize Your System Prompt
The system prompt is crucial for agent behavior. Be specific about:
- The agent’s role and expertise
- When to use which tools
- How to format responses
- What to prioritize (security, cost, performance)
3. Handle Errors Gracefully
Always wrap agent calls in try-except blocks:
```python
try:
    response = agent(user_input)
    print(f"\nAWS Architect: {response}")
except Exception as e:
    print(f"\nError: {e}")
    print("Please try rephrasing your question.")
```
4. Use Tool Filtering
If you only need specific MCP tools, list them first and keep only the names you want. A sketch using the Strands client's `list_tools_sync` (the regex patterns here are illustrative):

```python
import re

# Keep only the search tools and the regional availability check
allowed = [re.compile(r"^search_"), re.compile(r"^get_regional_availability$")]

with aws_client:
    tools = [
        t for t in aws_client.list_tools_sync()
        if any(p.search(t.tool_name) for p in allowed)
    ]
    # Build and use the agent while the MCP session is open
    agent = Agent(model=model, tools=tools)
```
5. Monitor Token Usage
LM Studio shows token usage in real time. If responses are slow:
- Reduce `max_tokens` in the model params
- Use smaller models for simpler queries
- Implement streaming responses (see the sketch below)
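Strands exposes an async streaming iterator on the agent. A minimal sketch, assuming `stream_async` yields event dicts whose `"data"` key carries text chunks (the pattern shown in the Strands docs):

```python
import asyncio

async def stream_reply(prompt: str) -> None:
    # Print text chunks as the model produces them
    async for event in agent.stream_async(prompt):
        if "data" in event:
            print(event["data"], end="", flush=True)
    print()

asyncio.run(stream_reply("Summarize the AWS Well-Architected Framework pillars."))
```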
Troubleshooting Common Issues
LM Studio Connection Failed
- Ensure LM Studio server is running on port 1234
- Check that a model is loaded
- Verify the base URL in your code matches LM Studio’s server address
MCP Tools Not Working
- Verify uv is installed: `uvx --version`
- Check internet connectivity (`uvx` downloads the MCP server package on first run)
- Look for error messages in the console
Agent Not Using Tools
- Make sure tool docstrings are clear and descriptive
- Check that the system prompt encourages tool usage
- Try more explicit questions that clearly require tool usage
Slow Responses
- Use smaller models (7B instead of 13B+)
- Reduce max_tokens parameter
- Limit the number of tools available to the agent
- Consider using GPU acceleration in LM Studio
Conclusion
You’ve now built a sophisticated AWS Solutions Architect agent that:
- Runs entirely on your local machine with LM Studio
- Has custom tools for architecture design and recommendations
- Connects to AWS’s official documentation through MCP
- Can answer complex questions about AWS services and best practices
This is just the beginning. You can extend this further by:
- Adding more custom tools for cost calculation, security auditing, or compliance checking
- Integrating additional MCP servers (Strands documentation, GitHub, etc.)
- Building a web interface with Streamlit or Gradio
- Creating specialized agents for different AWS domains (networking, security, data engineering)
The combination of Strands, LM Studio, and MCP creates a powerful, flexible, and privacy-respecting AI agent platform. All your data stays local, you have full control over the model and tools, and you can customize everything to your needs.
Resources
- Strands Agents Documentation: https://strandsagents.com
- LM Studio: https://lmstudio.ai/
- Model Context Protocol: https://modelcontextprotocol.io
- AWS Knowledge MCP Server: https://awslabs.github.io/mcp/servers/aws-knowledge-mcp-server/
- uv Installation Guide: https://docs.astral.sh/uv/getting-started/installation/
Happy building! If you create something cool with this setup, I’d love to hear about it in the comments below.