AI Starter Package

Production Deployment

VPS Setup for Agents

Agents need a persistent environment that stays running between sessions. A VPS (Virtual Private Server) gives you a dedicated machine where agents can execute tasks, store state, and run 24/7:

  • Minimum specs: 2 vCPU, 4 GB RAM for a 3-agent team. Scale to 4 vCPU / 8 GB when running 6+ agents concurrently. CPU matters more than RAM for orchestration workloads.
  • OS choice: Ubuntu LTS or Debian stable. Both have excellent package support and long-term security updates. Avoid bleeding-edge distributions — stability beats novelty in production.
  • Initial setup: Create a non-root user, enable SSH key authentication, disable password login, and configure the UFW firewall to allow only SSH and your monitoring port. This takes about 10 minutes and blocks the most common attack vectors.
  • Project structure: Clone your agent repository to /opt/agents/. Keep configuration in /etc/agents/. Store logs in /var/log/agents/. Separation of code, config, and logs simplifies debugging and updates.
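The SSH part of the initial setup is easy to get subtly wrong, so a small check can help. The following is a minimal sketch, not a full audit: it assumes the settings live in the main sshd_config text (it ignores Include files and stops at the first Match block), and the function names are hypothetical. It relies on sshd using the first value it finds for each keyword.

```python
import re


def effective_sshd_options(config_text):
    """Return global sshd options as {lowercased keyword: value}.

    Assumptions of this sketch: Include files are not followed, and
    parsing stops at the first Match block (conditional settings).
    """
    options = {}
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = re.split(r"[\s=]+", line, maxsplit=1)
        if len(parts) < 2:
            continue
        keyword, value = parts[0].lower(), parts[1].strip()
        if keyword == "match":
            break
        # sshd uses the first value obtained for each keyword.
        options.setdefault(keyword, value)
    return options


def ssh_hardening_problems(config_text):
    """List deviations from 'key authentication on, passwords off'."""
    opts = effective_sshd_options(config_text)
    problems = []
    # Password login is allowed by default when the option is absent.
    if opts.get("passwordauthentication", "yes").lower() != "no":
        problems.append("PasswordAuthentication should be 'no'")
    if opts.get("pubkeyauthentication", "yes").lower() != "yes":
        problems.append("PubkeyAuthentication should be 'yes'")
    return problems
```

On a server, this could be fed the contents of /etc/ssh/sshd_config; an empty list means both settings match the setup described above.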

Environment Secrets

API keys, database credentials, and tokens must never appear in code or git history. Production secret management follows a strict hierarchy:

  • Environment variables: Store secrets in /etc/agents/.env with chmod 600 permissions. Load them via your process manager, not by sourcing files in scripts.
  • Rotation policy: Rotate API keys every 90 days. Automate this where possible. With a 90-day rotation window, a compromised key stays usable for at most 90 days, which limits the blast radius.
  • Separation by environment: Maintain separate .env.staging and .env.production files. Never use production keys in development. A stray test can burn through your API budget in minutes.
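A pre-flight check can catch a world-readable or incomplete secrets file before any service starts. This is a rough sketch under stated assumptions: the function name and the key names in the usage example are hypothetical, and the parser only understands simple KEY=VALUE lines. It reports key names only and never prints secret values.

```python
import os
import stat


def env_file_problems(path, required_keys):
    """Return a list of problems with a secrets file (empty list = OK)."""
    problems = []

    # Any group/other permission bit means the file is broader than 600.
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & 0o077:
        problems.append(f"mode is {mode:03o}, expected 600")

    # Collect key names only; values are deliberately never stored.
    present = set()
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.strip()
            if line and not line.startswith("#") and "=" in line:
                present.add(line.split("=", 1)[0].strip())

    for key in required_keys:
        if key not in present:
            problems.append(f"missing key {key}")
    return problems
```

For example, env_file_problems("/etc/agents/.env", ["EXAMPLE_API_KEY"]) would return an empty list when the file is mode 600 and defines that (hypothetical) key.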

Cron-Based Scheduling

Most agent workflows run on a schedule: daily code reviews, hourly health checks, weekly reports. Cron is simple, reliable, and installed by default on most Linux servers:

  • Daily tasks: Run code quality audits at 2 AM, when API traffic is typically lowest and you are least likely to hit rate limits. Use 0 2 * * * in crontab (the time is in the server's time zone).
  • Hourly tasks: Health checks, log rotation, and metric collection. Use 0 * * * * for on-the-hour execution.
  • Overlap prevention: Use flock -n (non-blocking) to prevent overlapping executions. If a daily audit takes longer than 24 hours, the next run should skip, not stack.
  • Failure alerts: Redirect cron output (both stdout and stderr) to a log file and monitor it. A silent cron failure that goes unnoticed for a week defeats the purpose of automation.
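The skip-instead-of-stack behaviour can also live inside the job itself. Below is a minimal sketch using the same kernel lock that the flock command-line tool uses (Linux/Unix only); the helper name single_instance and the lock path in the usage note are hypothetical. A non-blocking exclusive lock either succeeds immediately or tells the job that a previous run is still active.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def single_instance(lock_path):
    """Yield True if this process got the lock, False if another run holds it."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        try:
            # LOCK_NB makes the call fail at once instead of waiting,
            # so a late run skips rather than queueing behind the old one.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            yield False
            return
        yield True
    finally:
        # Closing the descriptor releases the lock if we held it.
        os.close(fd)
```

A job might wrap its work as: with single_instance("/tmp/example-audit.lock") as acquired, then run the audit only when acquired is true and log a "skipped, previous run still active" line otherwise, so the skip itself is visible in the cron log.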

Process Management

Long-running agents need a process manager that handles restarts, logging, and resource limits. systemd is the standard on modern Linux:

  • Service files: Create a systemd unit file for each agent or agent group. Define restart policies, environment files, working directories, and resource limits in a declarative format.
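  • Restart policy: Use Restart=on-failure with RestartSec=30. This handles crashes without creating tight restart loops. Set StartLimitBurst=5 together with StartLimitIntervalSec=300 to stop after 5 failures within 5 minutes. With systemd's default 10-second interval, restarts spaced 30 seconds apart would never reach the limit.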
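  • Resource limits: Set MemoryMax=2G and CPUQuota=50% per agent (CPUQuota is a percentage of one core, so 50% is half a vCPU). Size the limits so their sum fits the host: three agents at 2G each would overcommit a 4 GB machine. A runaway agent should not starve other processes on the machine.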
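To make the directives above concrete, here is a sketch that renders a unit file from a template. The service name, user, and ExecStart path in the usage note are hypothetical; the directories follow the project structure from earlier in the lesson. StartLimitIntervalSec=300 is an assumption added so that the burst limit can actually trigger with a 30-second restart delay, and MemoryMax assumes a host using cgroup v2.

```python
# Template for one agent service. StartLimit* settings belong in [Unit].
UNIT_TEMPLATE = """\
[Unit]
Description=Agent service {name}
After=network-online.target
Wants=network-online.target
StartLimitIntervalSec=300
StartLimitBurst=5

[Service]
User={user}
WorkingDirectory=/opt/agents
EnvironmentFile=/etc/agents/.env
ExecStart={exec_start}
Restart=on-failure
RestartSec=30
MemoryMax=2G
CPUQuota=50%

[Install]
WantedBy=multi-user.target
"""


def render_unit(name, user, exec_start):
    """Return the text of a systemd unit file for one agent."""
    return UNIT_TEMPLATE.format(name=name, user=user, exec_start=exec_start)
```

For instance, render_unit("orchestrator", "agent", "/opt/agents/run-orchestrator") produces text that could be saved under /etc/systemd/system/ as a .service file, followed by systemctl daemon-reload and systemctl enable --now for that unit. The [Install] section is what lets the service start on boot.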

Monitoring and Alerts

A deployed system without monitoring is a system you will debug at 3 AM with no information. Set up three alert tiers:

  • Critical (immediate): Agent process crashed and did not restart. All agents failing simultaneously. Disk usage above 90%. These page you via SMS or push notification.
  • Warning (within hours): Error rate above 10%. Task queue growing faster than agents can process. API budget above 80% of monthly limit. These send an email or Slack message.
  • Info (daily digest): Tasks completed, average execution time, success rate, token usage. Review these daily to spot trends before they become incidents.
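The three tiers map naturally onto a small classification function. This is an illustrative sketch only: the Snapshot fields and the alert_tier name are hypothetical, and how each number is collected (and how the alert is delivered) is left out. The thresholds are the ones listed above.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """One reading of system health (all field names are hypothetical)."""
    agents_total: int
    agents_failing: int
    process_down: bool           # crashed and did not restart
    disk_used_fraction: float    # 0.0 to 1.0
    error_rate: float            # 0.0 to 1.0
    queue_growing: bool          # queue grows faster than agents drain it
    budget_used_fraction: float  # share of monthly API budget spent


def alert_tier(s):
    """Return 'critical', 'warning', or 'info' for a snapshot."""
    all_failing = s.agents_total > 0 and s.agents_failing == s.agents_total
    if s.process_down or all_failing or s.disk_used_fraction > 0.90:
        return "critical"  # page via SMS or push notification
    if s.error_rate > 0.10 or s.queue_growing or s.budget_used_fraction > 0.80:
        return "warning"   # email or Slack message
    return "info"          # goes into the daily digest
```

Checking the critical conditions first means a snapshot that trips both a critical and a warning threshold is reported once, at the higher tier.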

Practical Exercise

Set up a production-ready deployment for a 3-agent team on a VPS or local VM:

  • Create a non-root user and configure SSH key authentication
  • Write a .env file with your API keys and set file permissions to 600
  • Create a systemd service file for your agent orchestrator with restart policies
  • Add a cron job that runs a health check every hour and logs the results
  • Set up a simple alert that emails you when the agent process crashes
  • Deploy, verify the agent starts on boot, and test recovery by killing the process
