Multi-Environment Setup
Configure and manage Agent.md across development, staging, and production environments.
Overview
Multi-environment setup enables:
- Safe testing - Test changes in dev/staging before production
- Isolation - Each environment has separate data, secrets, and configs
- Cost control - Use cheaper models in dev, production models in prod
- Security - Production secrets never exposed to dev
- Debugging - Verbose logging in dev, quiet logging in prod
Environment Strategy
Three-Environment Setup
Development (dev)
↓ Test and iterate
Staging (staging)
↓ Final validation
Production (prod)
↓ Live agents
Characteristics:
| Environment | Purpose | LLM Model | Schedule | Logging | Cost |
|---|---|---|---|---|---|
| Dev | Rapid iteration | Cheap/local | Manual only | Verbose | Low |
| Staging | Pre-prod testing | Cloud, cheaper tier | Less frequent | Normal | Medium |
| Prod | Live operations | Best quality | As needed | Silent | High |
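When a helper script needs these per-environment defaults, the table translates directly into a case switch. A sketch, assuming the `ENVIRONMENT`, `DEFAULT_MODEL`, and `LOG_LEVEL` variable names used by the env files later in this guide:

```shell
# Map environment name to defaults (values mirror the table above).
ENVIRONMENT=${ENVIRONMENT:-dev}
case "$ENVIRONMENT" in
  dev)        DEFAULT_MODEL="llama3.1:8b";   LOG_LEVEL=DEBUG ;;
  staging)    DEFAULT_MODEL="gpt-3.5-turbo"; LOG_LEVEL=INFO ;;
  production) DEFAULT_MODEL="gpt-4";         LOG_LEVEL=WARNING ;;
  *) echo "unknown environment: $ENVIRONMENT" >&2; exit 1 ;;
esac
echo "$ENVIRONMENT -> $DEFAULT_MODEL ($LOG_LEVEL)"
```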
Directory Structure
Workspace Organization
agentmd/
├── .env.dev # Dev environment vars
├── .env.staging # Staging environment vars
├── .env.production # Production environment vars (gitignored)
├── .env.example # Template (committed)
│
├── workspace/
│ ├── dev/ # Development agents
│ │ ├── test-agent.md
│ │ └── experiment.md
│ ├── staging/ # Staging agents
│ │ ├── daily-report.md
│ │ └── log-monitor.md
│ └── production/ # Production agents
│ ├── daily-report.md
│ └── log-monitor.md
│
├── output/
│ ├── dev/ # Dev outputs
│ ├── staging/ # Staging outputs
│ └── production/ # Production outputs
│
└── data/
├── dev/ # Dev databases
├── staging/ # Staging databases
└── production/ # Production databases
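The layout above can be scaffolded in one pass. A sketch (run from the repo root in practice; the example uses a temp directory so it is safe to try anywhere):

```shell
# Create workspace/, output/, and data/ subdirectories for each environment.
cd "$(mktemp -d)"
for env in dev staging production; do
  mkdir -p "workspace/$env" "output/$env" "data/$env"
done
ls workspace output data
```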
Git Configuration
.gitignore:
# Environment files with secrets
.env.dev
.env.staging
.env.production
# Keep example template
!.env.example
# Environment-specific outputs
output/dev/
output/staging/
output/production/
# Environment-specific data
data/dev/
data/staging/
data/production/
.env.example (committed):
# LLM Provider
OPENAI_API_KEY=sk-proj-your-key-here
ANTHROPIC_API_KEY=sk-ant-your-key-here
# Agent.md Runtime
WORKSPACE_DIR=/workspace/production
OUTPUT_DIR=/output/production
DATABASE_PATH=/data/production/executions.db
# Environment
ENVIRONMENT=production
LOG_LEVEL=INFO
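One way to bootstrap the per-environment files is to copy the committed template and then edit each copy by hand. A sketch (the template contents here are abbreviated):

```shell
# Copy .env.example into one file per environment, then edit each copy.
cd "$(mktemp -d)"
printf 'ENVIRONMENT=production\nLOG_LEVEL=INFO\n' > .env.example
for env in dev staging production; do
  cp .env.example ".env.$env"
done
ls .env.*
```

Remember that the real `.env.dev`, `.env.staging`, and `.env.production` files hold secrets and stay gitignored; only `.env.example` is committed.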
Environment Configuration Files
Development (.env.dev)
# Environment
ENVIRONMENT=dev
LOG_LEVEL=DEBUG
# Paths
WORKSPACE_DIR=/workspace/dev
OUTPUT_DIR=/output/dev
DATABASE_PATH=/data/dev/executions.db
# LLM - Use cheap/local models
OPENAI_API_KEY=sk-proj-dev-key
DEFAULT_PROVIDER=ollama
DEFAULT_MODEL=llama3.1:8b
# External services - Point to dev/mock APIs
API_BASE_URL=http://localhost:3000
SLACK_WEBHOOK=https://hooks.slack.com/services/DEV/WEBHOOK
# Limits - Generous for testing
MAX_ITERATIONS=20
TIMEOUT=600
MAX_TOKENS=4000
# Cost control
ENABLE_EXPENSIVE_MODELS=false
Staging (.env.staging)
# Environment
ENVIRONMENT=staging
LOG_LEVEL=INFO
# Paths
WORKSPACE_DIR=/workspace/staging
OUTPUT_DIR=/output/staging
DATABASE_PATH=/data/staging/executions.db
# LLM - Cloud provider, but a cheaper model than prod
OPENAI_API_KEY=sk-proj-staging-key
DEFAULT_PROVIDER=openai
DEFAULT_MODEL=gpt-3.5-turbo
# External services - Point to staging APIs
API_BASE_URL=https://staging-api.example.com
SLACK_WEBHOOK=https://hooks.slack.com/services/STAGING/WEBHOOK
# Limits - Same as prod
MAX_ITERATIONS=10
TIMEOUT=300
MAX_TOKENS=2000
# Cost control
ENABLE_EXPENSIVE_MODELS=true
Production (.env.production)
# Environment
ENVIRONMENT=production
LOG_LEVEL=WARNING
# Paths
WORKSPACE_DIR=/workspace/production
OUTPUT_DIR=/output/production
DATABASE_PATH=/data/production/executions.db
# LLM - Use production keys and models
OPENAI_API_KEY=sk-proj-prod-key-REAL-SECRET
ANTHROPIC_API_KEY=sk-ant-prod-key-REAL-SECRET
DEFAULT_PROVIDER=openai
DEFAULT_MODEL=gpt-4
# External services - Live APIs
API_BASE_URL=https://api.example.com
SLACK_WEBHOOK=https://hooks.slack.com/services/PROD/WEBHOOK
DATABASE_URL=postgresql://prod-user:REAL-SECRET@prod-db:5432/agents
# Limits - Strict
MAX_ITERATIONS=5
TIMEOUT=180
MAX_TOKENS=1000
# Security
ENABLE_EXPENSIVE_MODELS=true
REQUIRE_APPROVAL=true
Environment-Specific Agent Configurations
Development Agent
workspace/dev/test-agent.md:
---
name: test-agent
model:
provider: ollama
name: llama3.1:8b # Free local model
temperature: 0.5
max_tokens: 4000 # High limit for testing
settings:
max_iterations: 20 # Allow more iterations for debugging
timeout: 600 # 10 minute timeout
log_level: DEBUG # Verbose logging
paths:
- /workspace/dev/data/
- /output/dev/
triggers:
- type: manual # Only run manually in dev
---
Test agent for development. This can have verbose instructions
and detailed logging. Cost doesn't matter here.
Use this to test new features, iterate on prompts, and debug issues.
Staging Agent
workspace/staging/daily-report.md:
---
name: daily-report
model:
provider: openai
name: gpt-3.5-turbo # Cheaper than prod but still cloud-based
temperature: 0.3
max_tokens: 1500
settings:
max_iterations: 10
timeout: 300
log_level: INFO
paths:
- /workspace/staging/data/
- /output/staging/reports/
triggers:
- type: schedule
cron: "0 10 * * *" # Run once daily in staging (different time than prod)
---
Generate daily report. This is the staging version - same prompt
as production, but running against staging data and APIs.
Production Agent
workspace/production/daily-report.md:
---
name: daily-report
model:
provider: openai
name: gpt-4 # Best quality for production
temperature: 0.1
max_tokens: 1000 # Strict limit
settings:
max_iterations: 5
timeout: 180
log_level: WARNING # Only log warnings/errors
paths:
- /workspace/production/data/
- /output/production/reports/
triggers:
- type: schedule
cron: "0 8 * * *" # 8 AM daily in production
---
Generate daily report. Concise prompt, optimized for cost and reliability.
CLI Environment Overrides
Load Specific Environment
# Development
export ENV_FILE=.env.dev
source .env.dev
agentmd start
# Staging
export ENV_FILE=.env.staging
source .env.staging
agentmd start
# Production
export ENV_FILE=.env.production
source .env.production
agentmd start
Override Workspace Directory
# Run dev agent
agentmd run test-agent --workspace /workspace/dev
# Run staging agent
agentmd run daily-report --workspace /workspace/staging
# Run production agent
agentmd run daily-report --workspace /workspace/production
Override Model at Runtime
# Test production agent with dev model
agentmd run daily-report \
--workspace /workspace/production \
--model ollama \
--model-name llama3.1:8b
# Test with verbose logging
agentmd run daily-report \
--workspace /workspace/production \
--verbose \
--debug
One-Time Environment Variable
# Override a single variable for a single run
LOG_LEVEL=DEBUG agentmd run daily-report --workspace /workspace/staging
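Shell supports this natively: prefixing a command with `VAR=value` sets the variable for that one command only, leaving the parent shell untouched. A self-contained illustration:

```shell
# The VAR=value cmd form sets the variable only in the child process.
MODE=dev sh -c 'echo "child sees MODE=$MODE"'
echo "parent sees MODE=${MODE:-unset}"
# prints:
#   child sees MODE=dev
#   parent sees MODE=unset
```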
Database Separation
Separate Databases per Environment
Development database: /data/dev/executions.db
Staging database: /data/staging/executions.db
Production database: /data/production/executions.db
Why Separate Databases?
- Isolation - Dev experiments don't pollute prod logs
- Performance - Prod database stays small and fast
- Debugging - Clear separation of test vs real executions
- Compliance - Production audit trail is separate
Backup Production Database
# Backup production database
sqlite3 /data/production/executions.db ".backup /backups/executions-$(date +%Y%m%d).db"
# Restore from backup
sqlite3 /data/production/executions.db ".restore /backups/executions-20260311.db"
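If daily backups accumulate, a pruning step keeps only the most recent ones. A sketch using GNU `head -n -7` (the example creates dummy files in a temp directory; adjust the path and retention count):

```shell
# Prune old backups, keeping the 7 most recent by date-stamped filename.
cd "$(mktemp -d)"
for d in 01 02 03 04 05 06 07 08 09 10; do
  touch "executions-202603${d}.db"
done
ls -1 executions-*.db | sort | head -n -7 | xargs -r rm --
ls -1 executions-*.db | wc -l    # 7 remain
```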
Copy Staging to Dev
# Copy staging database to dev for testing
cp /data/staging/executions.db /data/dev/executions.db
# Now dev has staging's execution history
Environment Promotion Workflow
1. Develop in Dev
# Switch to dev environment
source .env.dev
# Create agent
cat > workspace/dev/new-feature.md << 'EOF'
---
name: new-feature
model:
provider: ollama
name: llama3.1:8b
---
New feature under development.
EOF
# Test
agentmd run new-feature --verbose
2. Promote to Staging
# Copy agent to staging
cp workspace/dev/new-feature.md workspace/staging/new-feature.md
# Update for staging
vim workspace/staging/new-feature.md
# Change model to gpt-3.5-turbo
# Update paths to /workspace/staging/
# Update schedule if needed
# Switch to staging environment
source .env.staging
# Test in staging
agentmd run new-feature --verbose
# If tests pass, commit
git add workspace/staging/new-feature.md
git commit -m "Add new-feature agent to staging"
git push origin staging
3. Promote to Production
# Copy agent to production
cp workspace/staging/new-feature.md workspace/production/new-feature.md
# Update for production
vim workspace/production/new-feature.md
# Change model to gpt-4 (or appropriate prod model)
# Update paths to /workspace/production/
# Set strict limits (max_tokens, timeout, max_iterations)
# Reduce log_level to WARNING
# Switch to production environment
source .env.production
# Dry run
agentmd run new-feature --dry-run
# If dry run passes, commit
git add workspace/production/new-feature.md
git commit -m "Deploy new-feature agent to production"
git push origin main
# Deploy (see deployment section)
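The copy-and-edit promotion steps above can also be scripted. A minimal sketch (the file contents and sed substitution are illustrative; a real script would also adjust paths, limits, and log level):

```shell
# Copy the staging agent into production and swap the model name.
cd "$(mktemp -d)"
mkdir -p workspace/staging workspace/production
printf -- '---\nname: demo\nmodel:\n  name: gpt-3.5-turbo\n---\n' \
  > workspace/staging/demo.md
sed 's/gpt-3.5-turbo/gpt-4/' workspace/staging/demo.md \
  > workspace/production/demo.md
grep 'name: gpt-4' workspace/production/demo.md
```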
Environment Switcher Script
scripts/env.sh:
#!/bin/bash
# Usage: source scripts/env.sh dev|staging|production
ENV=$1
if [ -z "$ENV" ]; then
echo "Usage: source scripts/env.sh dev|staging|production"
return 1
fi
case $ENV in
dev)
export ENV_FILE=.env.dev
;;
staging)
export ENV_FILE=.env.staging
;;
production)
export ENV_FILE=.env.production
;;
*)
echo "Unknown environment: $ENV"
return 1
;;
esac
# Load environment
set -a
source "$ENV_FILE"
set +a
echo "Switched to $ENV environment"
echo " WORKSPACE_DIR: $WORKSPACE_DIR"
echo " OUTPUT_DIR: $OUTPUT_DIR"
echo " DATABASE_PATH: $DATABASE_PATH"
echo " DEFAULT_MODEL: $DEFAULT_MODEL"
Usage:
# Switch to dev
source scripts/env.sh dev
# Switch to staging
source scripts/env.sh staging
# Switch to production
source scripts/env.sh production
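The `set -a` / `set +a` pair in env.sh is what makes the sourced values visible to child processes: every variable assigned while `set -a` (allexport) is active is exported automatically. A self-contained illustration:

```shell
# set -a exports every assignment made while it is active,
# so values sourced from a .env file reach child processes.
cd "$(mktemp -d)"
printf 'GREETING=hello\n' > demo.env
set -a
. ./demo.env
set +a
sh -c 'echo "child sees GREETING=$GREETING"'
```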
Testing Across Environments
Validation Script
scripts/validate-all.sh:
#!/bin/bash
# Validate agents in all environments
shopt -s nullglob  # empty dirs yield zero iterations instead of a literal glob
echo "Validating dev agents..."
for agent in workspace/dev/*.md; do
agentmd validate "$agent" || exit 1
done
echo "Validating staging agents..."
for agent in workspace/staging/*.md; do
agentmd validate "$agent" || exit 1
done
echo "Validating production agents..."
for agent in workspace/production/*.md; do
agentmd validate "$agent" || exit 1
done
echo "All agents validated successfully!"
Integration Tests
scripts/test-staging.sh:
#!/bin/bash
# Run all staging agents and check for errors
source .env.staging
for agent in workspace/staging/*.md; do
agent_name=$(basename "$agent" .md)
echo "Testing $agent_name..."
agentmd run "$agent_name" --workspace /workspace/staging
if [ $? -ne 0 ]; then
echo "FAILED: $agent_name"
exit 1
fi
done
echo "All staging agents passed!"
Smoke Tests
scripts/smoke-test-prod.sh:
#!/bin/bash
# Quick smoke test of production agents
source .env.production
# Test each agent with dry-run
for agent in workspace/production/*.md; do
agent_name=$(basename "$agent" .md)
echo "Smoke testing $agent_name..."
agentmd run "$agent_name" --dry-run
if [ $? -ne 0 ]; then
echo "FAILED: $agent_name"
exit 1
fi
done
echo "Production smoke tests passed!"
Cost Monitoring
Track Costs by Environment
# Dev costs (should be $0 with local models)
agentmd logs --workspace /workspace/dev --format json | \
jq '[.[].total_tokens] | add'
# Staging costs
agentmd logs --workspace /workspace/staging --format json | \
jq '[.[].total_tokens] | add'
# Production costs
agentmd logs --workspace /workspace/production --format json | \
jq '[.[].total_tokens] | add'
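If jq is unavailable, the same totals can be computed with awk over a plain column of token counts. A sketch (the sample numbers and the blended $0.045/1K rate are illustrative):

```shell
# Sum token counts and estimate cost with awk instead of jq.
printf '1200\n800\n2500\n' > /tmp/tokens.txt
TOKENS=$(awk '{s += $1} END {print s}' /tmp/tokens.txt)
COST=$(awk -v t="$TOKENS" 'BEGIN {printf "%.2f", t * 0.045 / 1000}')
echo "tokens=$TOKENS cost=\$$COST"
# prints: tokens=4500 cost=$0.20
```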
Cost Alert Script
scripts/cost-alert.sh:
#!/bin/bash
# Alert if production costs exceed threshold
source .env.production
# Get total tokens from last 24 hours
TOKENS=$(agentmd logs --since "24 hours ago" --format json | \
jq '[.[].total_tokens] | add')
# Calculate cost (example: $0.03/1K input + $0.06/1K output for GPT-4)
# Simplified: assume 50/50 input/output split
COST=$(echo "scale=2; $TOKENS * 0.045 / 1000" | bc)
# Alert threshold: $10/day
THRESHOLD=10
if (( $(echo "$COST > $THRESHOLD" | bc -l) )); then
echo "ALERT: Production costs exceeded threshold!"
echo " Tokens: $TOKENS"
echo " Cost: \$$COST"
echo " Threshold: \$$THRESHOLD"
# Send alert (Slack, email, etc.)
# curl -X POST "$SLACK_WEBHOOK" -H 'Content-type: application/json' \
#   -d "{\"text\": \"Production costs: \$$COST\"}"
fi
Deployment
Dev Deployment (Local)
# No remote deployment needed - run the runtime locally
source .env.dev
agentmd start
Staging Deployment
# Deploy to staging server
rsync -avz workspace/staging/ user@staging-server:/opt/agentmd/workspace/
rsync -avz .env.staging user@staging-server:/opt/agentmd/.env
# Restart staging runtime
ssh user@staging-server 'systemctl restart agentmd'
Production Deployment
# 1. Validate locally
source .env.production
for agent in workspace/production/*.md; do
agentmd validate "$agent" || exit 1
done
# 2. Backup production
ssh user@prod-server 'cp -r /opt/agentmd/workspace /opt/agentmd/workspace.backup'
# 3. Deploy
rsync -avz workspace/production/ user@prod-server:/opt/agentmd/workspace/
rsync -avz .env.production user@prod-server:/opt/agentmd/.env
# 4. Restart (with health check)
ssh user@prod-server 'systemctl restart agentmd && sleep 5 && systemctl status agentmd'
# 5. Smoke test
ssh user@prod-server 'agentmd validate /opt/agentmd/workspace/*.md'
Rollback Production
# Restore from backup
ssh user@prod-server 'cp -r /opt/agentmd/workspace.backup /opt/agentmd/workspace'
ssh user@prod-server 'systemctl restart agentmd'
Monitoring
Health Check Endpoints
# Dev
curl http://localhost:8000/health
# Staging
curl https://staging-agentmd.example.com/health
# Production
curl https://agentmd.example.com/health
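Deploy and monitoring scripts usually retry a health check a few times before declaring failure. A generic retry helper (wrap `curl -fsS <url>` with it in practice):

```shell
# retry <attempts> <command...>: rerun a command until it succeeds
# or the attempt budget is exhausted.
retry() {
  n=$1; shift
  i=1
  while [ "$i" -le "$n" ]; do
    "$@" && return 0
    i=$((i + 1))
  done
  return 1
}
retry 3 true  && echo "ok"
retry 3 false || echo "gave up after 3 attempts"
```

In a deploy script, `retry 5 curl -fsS https://agentmd.example.com/health` (with a `sleep` between attempts) gives the runtime time to come up after a restart.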
Log Aggregation
# Collect logs from all environments
ssh user@staging-server 'agentmd logs --since "1 hour ago"' > staging-logs.txt
ssh user@prod-server 'agentmd logs --since "1 hour ago"' > prod-logs.txt
# Compare error counts between environments
grep -c ERROR staging-logs.txt prod-logs.txt
Secrets Management
Use Secret Manager (Production)
Instead of .env files in production, use a secret manager:
AWS Secrets Manager:
# Store secret
aws secretsmanager create-secret \
--name agentmd/production/openai-api-key \
--secret-string "sk-proj-REAL-SECRET"
# Retrieve in production startup script
export OPENAI_API_KEY=$(aws secretsmanager get-secret-value \
--secret-id agentmd/production/openai-api-key \
--query SecretString --output text)
Vault:
# Store secret
vault kv put secret/agentmd/production \
openai_api_key="sk-proj-REAL-SECRET"
# Retrieve
export OPENAI_API_KEY=$(vault kv get -field=openai_api_key secret/agentmd/production)
Environment Checklist
Before promoting to next environment:
Dev → Staging:
- [ ] Agent works in dev
- [ ] All paths updated to staging paths
- [ ] Model switched from local to cloud (if applicable)
- [ ] Schedule adjusted if needed
- [ ] Committed to git
Staging → Production:
- [ ] Agent tested in staging for 24+ hours
- [ ] No errors in staging logs
- [ ] Token usage is acceptable
- [ ] Paths updated to production paths
- [ ] Model is production-grade
- [ ] Strict limits set (max_tokens, timeout, max_iterations)
- [ ] Log level reduced (WARNING or ERROR)
- [ ] Secrets are in secret manager (not .env)
- [ ] Code review completed
- [ ] Deployment plan documented
- [ ] Rollback plan ready
- [ ] Committed to git (main branch)