Operations Medium

Deployment

Overview

Deployment is the boundary between development and operation. A good deployment system makes this boundary transparent. A bad one creates fear, delay, and error.

Principles

Reproducibility

Every deployment should be reproducible from source. No manual steps. No “works on my machine.” The deployment artifact must be identical regardless of who triggers it.

Observability

Deployment is not complete when the command finishes. It is complete when the system is healthy. Health checks must verify functionality, not just process existence.

Rollback Safety

Every deployment must be reversible within 5 minutes. This constraint shapes every deployment design decision.

Strategies

Blue/Green

Two identical environments. Traffic switches from blue to green. Zero downtime, instant rollback. Cost: double compute during deployment.

Rolling

Replace instances gradually. No duplicate infrastructure. Risk: mixed-state during deployment. Rollback is slower.

Canary

Deploy to 5% of traffic first. Monitor metrics. Gradually increase. Best for high-traffic services. Requires metric-driven decision making.

This Project’s Approach

Static sites use rolling deployment (no state, instant propagation). API services use blue/green (stateful connections require clean cutover).

Trade-offs

  • Speed vs. safety: Faster deployments reduce lead time but increase blast radius
  • Cost vs. resilience: Blue/green costs more but eliminates deployment downtime
  • Automation vs. control: Fully automated deployments require high confidence in test coverage

Implementation Checklist

  • Build artifact is immutable and tagged
  • Database migrations are backward-compatible
  • Health checks verify application functionality
  • Rollback is automated and tested
  • Deployment events are logged and alerted
  • Secrets are injected at runtime, not baked into images

Lessons

The most common deployment failure is not technical. It is procedural: deploying on Friday afternoon, skipping health checks, or deploying without rollback capability.

Technical debt in deployment is more expensive than technical debt in code. Bad code can be refactored. Bad deployment can take the system offline.

Maintenance

  • Review deployment frequency monthly (target: daily minimum)
  • Audit rollback procedures quarterly (test in staging)
  • Review failed deployments within 24 hours
  • Update runbooks when procedures change