Building an AWS Solutions Architect Agent with Strands, LM Studio, and MCP Tools
Running AI agents locally has never been more accessible. In this tutorial, you’ll learn how to build a sophisticated AWS Solutions Architect agent that runs entirely on your machine using LM Studio, extended with powerful AWS documentation tools through the Model Context Protocol (MCP).
What You’ll Build
By the end of this guide, you’ll have:
- A conversational AI agent running on your local LM Studio instance
- Custom AWS architecture tools for VPC design, compute and database recommendations, and security guidance
- Integration with AWS’s official MCP server for real-time documentation access
- A complete solutions architect assistant that can answer questions about AWS services, regional availability, and best practices
Prerequisites
Before we begin, make sure you have:
- Python 3.8 or higher installed
- [LM Studio](https://lmstudio.ai/) downloaded and installed
- A local LLM model loaded in LM Studio (I recommend Llama 3 or Mistral variants)
- Basic familiarity with Python and AWS concepts
Part 1: Setting Up Your First Strands Agent with LM Studio
Install Dependencies
First, create a new project directory and install the required packages:
```bash
pip install 'strands-agents[openai]'
```
Create Your First Agent
Strands makes it incredibly simple to connect to LM Studio. Create a file called `strands_agent.py`:
```python
from strands import Agent
from strands.models.openai import OpenAIModel

# Configure an OpenAI-compatible model pointing to LM Studio
model = OpenAIModel(
    client_args={
        "api_key": "lm-studio",  # LM Studio doesn't require a real key
        "base_url": "http://localhost:1234/v1",
    },
    model_id="local-model",  # LM Studio routes requests to the currently loaded model
    params={
        "max_tokens": 1000,
        "temperature": 0.7,
    }
)

# Create a Strands agent with LM Studio as the LLM provider
agent = Agent(
    name="LMStudioChatAgent",
    system_prompt="You are a helpful AI assistant. Provide clear, concise, and accurate responses.",
    model=model
)

def main():
    print("Strands Agent with LM Studio")
    print("=" * 50)
    print("Make sure LM Studio is running on http://localhost:1234")
    print("Type 'quit' to exit")
    print("=" * 50)

    while True:
        user_input = input("\nYou: ").strip()
        if user_input.lower() == 'quit':
            print("Goodbye!")
            break
        if not user_input:
            continue

        # Run the agent with the user's message
        response = agent(user_input)
        print(f"\nAssistant: {response}")

if __name__ == "__main__":
    main()
```
Start LM Studio
- Open LM Studio
- Load your preferred model (e.g., Llama 3.1 8B)
- Navigate to the “Local Server” tab
- Click “Start Server” (it will run on http://localhost:1234 by default)
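To confirm the server is up before wiring in the agent, you can query the OpenAI-compatible models endpoint that LM Studio exposes:

```bash
# Should return a JSON list of the models LM Studio has loaded
curl http://localhost:1234/v1/models
```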
Run Your Agent
```bash
python strands_agent.py
```
You now have a working conversational agent running entirely on your local machine! The beauty of Strands is that it automatically handles conversation history, so your agent maintains context across multiple turns.
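For example, a follow-up question can lean on an earlier turn without restating it. Here's a minimal sketch reusing the `agent` defined above (assuming the default Strands conversation manager):

```python
# Each call appends to the same conversation history
agent("My VPC uses the CIDR block 10.0.0.0/16.")
response = agent("How many IP addresses does that give me?")  # "that" resolves via history
print(response)
```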
Part 2: Extending to an AWS Solutions Architect
Now let’s transform this basic agent into a specialized AWS Solutions Architect by adding custom tools.
Understanding Strands Tools
Strands uses the `@tool` decorator to expose Python functions to your agent. The agent can intelligently decide when to call these functions based on user queries.
Create AWS Architecture Tools
Create `aws_architect_agent.py` with specialized AWS tools:
```python
from strands import Agent, tool
from strands.models.openai import OpenAIModel

# Define AWS architecture tools
@tool
def design_vpc(cidr_block: str, availability_zones: int = 2) -> str:
    """Design a VPC architecture with subnets across availability zones.

    Args:
        cidr_block: The CIDR block for the VPC (e.g., "10.0.0.0/16")
        availability_zones: Number of availability zones to use (default: 2)
    """
    # Use the first two octets (e.g., "10.0") as the base for subnet CIDRs
    base = ".".join(cidr_block.split("/")[0].split(".")[:2])
    public_subnets = "\n".join(
        f"  AZ-{i + 1} Public:  {base}.{i * 16}.0/20"
        for i in range(availability_zones)
    )
    private_subnets = "\n".join(
        f"  AZ-{i + 1} Private: {base}.{i * 16 + 128}.0/20"
        for i in range(availability_zones)
    )
    return f"""
VPC Architecture Design:
- VPC CIDR: {cidr_block}
- Availability Zones: {availability_zones}
- Public Subnets: {availability_zones} (one per AZ)
- Private Subnets: {availability_zones} (one per AZ)
- NAT Gateways: {availability_zones} (one per AZ for high availability)
- Internet Gateway: 1
- Route Tables: 2 (public and private)

Recommended subnet allocation:
{public_subnets}
{private_subnets}
"""

@tool
def recommend_compute(workload_type: str, expected_traffic: str) -> str:
    """Recommend compute services based on workload characteristics.

    Args:
        workload_type: Type of workload (e.g., "web", "batch", "microservices", "ml")
        expected_traffic: Expected traffic pattern (e.g., "steady", "variable", "spiky", "unpredictable")
    """
    recommendations = {
        "web": {
            "steady": "EC2 with Auto Scaling Group + Application Load Balancer",
            "variable": "ECS Fargate with Application Load Balancer",
            "spiky": "Lambda with API Gateway",
            "unpredictable": "ECS Fargate with Auto Scaling"
        },
        "batch": {
            "steady": "EC2 Spot Instances with AWS Batch",
            "variable": "AWS Batch with Fargate",
            "spiky": "Lambda or Step Functions",
            "unpredictable": "AWS Batch with mixed instance types"
        },
        "microservices": {
            "steady": "EKS with managed node groups",
            "variable": "ECS Fargate",
            "spiky": "Lambda with API Gateway",
            "unpredictable": "EKS with Karpenter autoscaling"
        },
        "ml": {
            "steady": "SageMaker endpoints with auto-scaling",
            "variable": "SageMaker Serverless Inference",
            "spiky": "Lambda with SageMaker async inference",
            "unpredictable": "SageMaker with auto-scaling"
        }
    }
    recommendation = recommendations.get(workload_type.lower(), {}).get(
        expected_traffic.lower(),
        "Please provide a valid workload_type and expected_traffic"
    )
    return f"Recommended compute service for {workload_type} workload with {expected_traffic} traffic: {recommendation}"

@tool
def design_database(data_type: str, scale: str, consistency_requirement: str = "strong") -> str:
    """Recommend database services based on data characteristics.

    Args:
        data_type: Type of data (e.g., "relational", "document", "key-value", "graph", "timeseries")
        scale: Expected scale (e.g., "small", "medium", "large", "massive")
        consistency_requirement: Consistency requirement (e.g., "strong", "eventual")
    """
    recommendations = {
        "relational": {
            "small": "RDS (MySQL/PostgreSQL) - Single AZ",
            "medium": "RDS (MySQL/PostgreSQL) - Multi-AZ",
            "large": "Aurora PostgreSQL with read replicas",
            "massive": "Aurora PostgreSQL Global Database"
        },
        "document": {
            "small": "DocumentDB (single instance)",
            "medium": "DocumentDB (replica set)",
            "large": "DocumentDB (sharded cluster)",
            "massive": "DynamoDB with on-demand capacity"
        },
        "key-value": {
            "small": "ElastiCache Redis (single node)",
            "medium": "ElastiCache Redis (cluster mode disabled)",
            "large": "ElastiCache Redis (cluster mode enabled)",
            "massive": "DynamoDB with DAX"
        },
        "graph": {
            "small": "Neptune (single instance)",
            "medium": "Neptune (with read replicas)",
            "large": "Neptune (multi-region)",
            "massive": "Neptune (global database)"
        },
        "timeseries": {
            "small": "RDS PostgreSQL with TimescaleDB",
            "medium": "Timestream",
            "large": "Timestream with data tiering",
            "massive": "Timestream + S3 for long-term storage"
        }
    }
    db_service = recommendations.get(data_type.lower(), {}).get(scale.lower(), "Invalid data_type or scale")
    consistency_note = ""
    if consistency_requirement.lower() == "eventual" and data_type.lower() in ["document", "key-value"]:
        consistency_note = "\nNote: Consider DynamoDB for eventual consistency requirements, with global tables for multi-region."
    return f"Recommended database: {db_service}{consistency_note}"

@tool
def security_best_practices(resource_type: str) -> str:
    """Provide security best practices for AWS resources.

    Args:
        resource_type: Type of AWS resource (e.g., "s3", "ec2", "rds")
    """
    practices = {
        "s3": """
S3 Security Best Practices:
1. Enable bucket encryption (SSE-S3 or SSE-KMS)
2. Block public access unless explicitly needed
3. Enable versioning for data protection
4. Use bucket policies with least privilege
5. Enable access logging
6. Enable MFA Delete for critical buckets
7. Use VPC endpoints for private access
""",
        "ec2": """
EC2 Security Best Practices:
1. Use Systems Manager Session Manager instead of SSH
2. Keep AMIs and software up to date
3. Use security groups with least privilege (no 0.0.0.0/0 for SSH)
4. Enable detailed monitoring
5. Use IAM roles instead of access keys
6. Encrypt EBS volumes
7. Use AWS Systems Manager for patch management
""",
        "rds": """
RDS Security Best Practices:
1. Enable encryption at rest
2. Use SSL/TLS for connections
3. Place in private subnets
4. Use security groups to restrict access
5. Enable automated backups
6. Enable Multi-AZ for production
7. Use IAM database authentication
8. Enable Enhanced Monitoring
"""
    }
    return practices.get(resource_type.lower(), f"Security best practices not available for {resource_type}")

# Configure the model
model = OpenAIModel(
    client_args={
        "api_key": "lm-studio",
        "base_url": "http://localhost:1234/v1",
    },
    model_id="local-model",
    params={
        "max_tokens": 2000,
        "temperature": 0.7,
    }
)

# Create the AWS Architect agent with tools
agent = Agent(
    name="AWSArchitect",
    system_prompt="""You are an expert AWS Solutions Architect with deep knowledge of AWS services,
best practices, and cost optimization. You help design scalable, secure, and cost-effective
cloud architectures.

When designing solutions:
1. Always consider high availability and fault tolerance
2. Follow the AWS Well-Architected Framework pillars
3. Recommend appropriate services based on workload characteristics
4. Consider cost optimization opportunities
5. Emphasize security best practices
6. Think about scalability and future growth

Use your tools to provide specific recommendations and designs.""",
    model=model,
    tools=[
        design_vpc,
        recommend_compute,
        design_database,
        security_best_practices
    ]
)

def main():
    print("AWS Solutions Architect Agent")
    print("=" * 60)
    print("Ask me about AWS architecture, best practices, and design!")
    print("Type 'quit' to exit")
    print("=" * 60)

    while True:
        user_input = input("\nYou: ").strip()
        if user_input.lower() == 'quit':
            print("Goodbye!")
            break
        if not user_input:
            continue

        response = agent(user_input)
        print(f"\nAWS Architect: {response}")

if __name__ == "__main__":
    main()
```
Try It Out
Run your enhanced agent:
```bash
python aws_architect_agent.py
```
Now you can ask questions like:
- “Design a VPC with CIDR 10.0.0.0/16 across 3 availability zones”
- “What compute service should I use for a microservices workload with spiky traffic?”
- “Recommend a database for large-scale relational data”
- “What are the security best practices for S3?”
The agent will intelligently call the appropriate tools and provide detailed recommendations.
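Because `@tool` functions are still plain Python underneath, you can also sanity-check them directly before involving the model, which helps when debugging tool output formatting. A quick sketch, assuming the decorated functions remain directly callable as in current Strands releases:

```python
# Exercise the tools without going through the LLM
print(design_vpc("10.0.0.0/16", availability_zones=3))
print(recommend_compute("microservices", "spiky"))
```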
Part 3: Adding AWS MCP Tools for Real-Time Documentation
While custom tools are powerful, wouldn’t it be great to have access to AWS’s official documentation in real-time? This is where the Model Context Protocol (MCP) shines.
What is MCP?
Model Context Protocol is an open standard that allows AI agents to connect to external tool servers. Instead of hardcoding tools, you can dynamically load them from MCP servers maintained by AWS, Anthropic, and the community.
Install MCP Dependencies
First, install the required packages:
```bash
pip install mcp strands-agents
```
You'll also need uv, the Python package manager whose `uvx` command launches the MCP server:

macOS/Linux:

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

Windows:

```powershell
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
```

Or with pip:

```bash
pip install uv
```
Create the MCP-Enhanced Agent
Create `aws_architect_mcp_simple.py`:
```python
from mcp import stdio_client, StdioServerParameters
from strands import Agent
from strands.models.openai import OpenAIModel
from strands.tools.mcp import MCPClient

# Configure an OpenAI-compatible model pointing to LM Studio
model = OpenAIModel(
    client_args={
        "api_key": "lm-studio",
        "base_url": "http://localhost:1234/v1",
    },
    model_id="local-model",
    params={
        "max_tokens": 2000,
        "temperature": 0.7,
    }
)

# Create an MCP client for the AWS Knowledge server
aws_knowledge_client = MCPClient(
    lambda: stdio_client(
        StdioServerParameters(
            command="uvx",
            args=["fastmcp", "run", "https://knowledge-mcp.global.api.aws"]
        )
    ),
    prefix="aws"
)

# Create the agent with the MCP client
agent = Agent(
    name="AWSArchitect",
    system_prompt="""You are an expert AWS Solutions Architect with access to AWS documentation.
Use your tools to:
- Search AWS documentation for current information
- Check service availability in different regions
- Provide architecture recommendations based on AWS best practices
- Always cite your sources from AWS documentation""",
    model=model,
    tools=[aws_knowledge_client]
)

def main():
    print("AWS Solutions Architect Agent (with MCP)")
    print("=" * 60)
    print("Connected to AWS Knowledge MCP Server")
    print("Type 'quit' to exit")
    print("=" * 60)

    while True:
        user_input = input("\nYou: ").strip()
        if user_input.lower() == 'quit':
            print("Goodbye!")
            break
        if not user_input:
            continue

        try:
            response = agent(user_input)
            print(f"\nAWS Architect: {response}")
        except Exception as e:
            print(f"\nError: {e}")

if __name__ == "__main__":
    main()
```
What MCP Tools Are Available?
The AWS Knowledge MCP server provides several powerful tools (you can confirm the exact names with the snippet after this list):
- `search_documentation`: Search AWS documentation for specific topics
- `read_documentation`: Read specific AWS documentation pages
- `get_regional_availability`: Check if services are available in specific AWS regions
- `list_regions`: Get a list of all AWS regions
- `recommend`: Get content recommendations based on documentation pages
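To see exactly which tool names your agent will receive, you can list them through the client. A short sketch using Strands' `list_tools_sync`; note the MCP session must be open, hence the `with` block:

```python
# Print the tools the AWS Knowledge MCP server exposes
with aws_knowledge_client:
    for t in aws_knowledge_client.list_tools_sync():
        print(t.tool_name)
```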
Run Your MCP-Enhanced Agent
```bash
python aws_architect_mcp_simple.py
```
Now you can ask questions like:
- “What is AWS Lambda and when should I use it?”
- “Is Amazon ECS available in eu-west-1?”
- “What are the latest features in Amazon S3?”
- “How do I set up a highly available web application?”
The agent will search AWS’s official documentation and provide accurate, up-to-date answers with citations!
Part 4: Combining Custom Tools with MCP
For the ultimate AWS architect agent, you can combine your custom tools with MCP tools. Here’s how:
```python
from mcp import stdio_client, StdioServerParameters
from strands import Agent, tool
from strands.models.openai import OpenAIModel
from strands.tools.mcp import MCPClient

# Your custom tools
@tool
def design_vpc(cidr_block: str, availability_zones: int = 2) -> str:
    """Design a VPC architecture with subnets across availability zones."""
    # ... implementation from Part 2 ...

@tool
def recommend_compute(workload_type: str, expected_traffic: str) -> str:
    """Recommend compute services based on workload characteristics."""
    # ... implementation from Part 2 ...

# Configure the model
model = OpenAIModel(
    client_args={
        "api_key": "lm-studio",
        "base_url": "http://localhost:1234/v1",
    },
    model_id="local-model",
    params={
        "max_tokens": 2000,
        "temperature": 0.7,
    }
)

# Create the MCP client
aws_knowledge_client = MCPClient(
    lambda: stdio_client(
        StdioServerParameters(
            command="uvx",
            args=["fastmcp", "run", "https://knowledge-mcp.global.api.aws"]
        )
    ),
    prefix="aws"
)

# Combine custom tools with MCP tools
agent = Agent(
    name="AWSArchitect",
    system_prompt="""You are an expert AWS Solutions Architect with both custom design tools
and access to AWS documentation. Use your custom tools for architecture design and MCP tools
for documentation lookup.""",
    model=model,
    tools=[
        design_vpc,
        recommend_compute,
        aws_knowledge_client  # MCP tools loaded dynamically
    ]
)
```
This gives you the best of both worlds: custom logic for architecture patterns and real-time access to AWS documentation.
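If your Strands version doesn't accept an `MCPClient` instance directly in `tools`, the documented fallback is to open the session yourself and pass in the listed tools. A hedged sketch; keep the agent usage inside the `with` block so the MCP session stays alive:

```python
# Explicit session management: list MCP tools and combine with custom tools
with aws_knowledge_client:
    mcp_tools = aws_knowledge_client.list_tools_sync()
    agent = Agent(
        name="AWSArchitect",
        system_prompt="You are an expert AWS Solutions Architect.",
        model=model,
        tools=[design_vpc, recommend_compute, *mcp_tools],
    )
    print(agent("Is Amazon ECS available in eu-west-1?"))
```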
Best Practices and Tips
1. Choose the Right Model
For AWS architecture tasks, I recommend:
- Llama 3.1 8B: Great balance of speed and capability
- Mistral 7B: Excellent for technical tasks
- Qwen 2.5 14B: Superior reasoning for complex architectures
2. Optimize Your System Prompt
The system prompt is crucial for agent behavior. Be specific about:
- The agent’s role and expertise
- When to use which tools
- How to format responses
- What to prioritize (security, cost, performance)
3. Handle Errors Gracefully
Always wrap agent calls in try-except blocks:
```python
try:
    response = agent(user_input)
    print(f"\nAWS Architect: {response}")
except Exception as e:
    print(f"\nError: {e}")
    print("Please try rephrasing your question.")
```
4. Use Tool Filtering
If you only need specific MCP tools, list them first and keep only the names you want. A sketch using the Strands client's `list_tools_sync` (the regex patterns here are illustrative):

```python
import re

# Keep only the search tools and the regional availability check
allowed = [re.compile(r"^search_"), re.compile(r"^get_regional_availability$")]

with aws_client:
    tools = [
        t for t in aws_client.list_tools_sync()
        if any(p.search(t.tool_name) for p in allowed)
    ]
    # Build and use the agent while the MCP session is open
    agent = Agent(model=model, tools=tools)
```
5. Monitor Token Usage
LM Studio shows token usage in real time. If responses are slow:
- Reduce `max_tokens` in the model params
- Use smaller models for simpler queries
- Implement streaming responses (see the sketch below)
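Strands exposes an async streaming iterator on the agent. A minimal sketch, assuming `stream_async` yields event dicts whose `"data"` key carries text chunks (the pattern shown in the Strands docs):

```python
import asyncio

async def stream_reply(prompt: str) -> None:
    # Print text chunks as the model produces them
    async for event in agent.stream_async(prompt):
        if "data" in event:
            print(event["data"], end="", flush=True)
    print()

asyncio.run(stream_reply("Summarize the AWS Well-Architected Framework pillars."))
```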
Troubleshooting Common Issues
LM Studio Connection Failed
- Ensure LM Studio server is running on port 1234
- Check that a model is loaded
- Verify the base URL in your code matches LM Studio’s server address
MCP Tools Not Working
- Verify uv is installed: `uvx --version`
- Check internet connectivity (`uvx` downloads the MCP server package on first run)
- Look for error messages in the console
Agent Not Using Tools
- Make sure tool docstrings are clear and descriptive
- Check that the system prompt encourages tool usage
- Try more explicit questions that clearly require tool usage
Slow Responses
- Use smaller models (7B instead of 13B+)
- Reduce max_tokens parameter
- Limit the number of tools available to the agent
- Consider using GPU acceleration in LM Studio
Conclusion
You’ve now built a sophisticated AWS Solutions Architect agent that:
- Runs entirely on your local machine with LM Studio
- Has custom tools for architecture design and recommendations
- Connects to AWS’s official documentation through MCP
- Can answer complex questions about AWS services and best practices
This is just the beginning. You can extend this further by:
- Adding more custom tools for cost calculation, security auditing, or compliance checking
- Integrating additional MCP servers (Strands documentation, GitHub, etc.)
- Building a web interface with Streamlit or Gradio
- Creating specialized agents for different AWS domains (networking, security, data engineering)
The combination of Strands, LM Studio, and MCP creates a powerful, flexible, and privacy-respecting AI agent platform. All your data stays local, you have full control over the model and tools, and you can customize everything to your needs.
Resources
- Strands Agents Documentation: https://strandsagents.com
- LM Studio: https://lmstudio.ai/
- Model Context Protocol: https://modelcontextprotocol.io
- AWS Knowledge MCP Server: https://awslabs.github.io/mcp/servers/aws-knowledge-mcp-server/
- uv Installation Guide: https://docs.astral.sh/uv/getting-started/installation/
Happy building! If you create something cool with this setup, I’d love to hear about it in the comments below.