NGINX, HAProxy, Keepalived, Certbot, Ansible, Docker Compose, Fail2ban, Bash

High Availability Reverse Proxy

Production HA reverse proxy with NGINX, HAProxy, Keepalived VRRP failover, automated Let's Encrypt SSL/TLS (A+ score), Ansible configuration management, and failover testing

Overview

A production-grade high availability reverse proxy and load balancing layer built for zero-downtime traffic management. Uses NGINX as the primary reverse proxy with HAProxy as an alternative, Keepalived for VRRP-based VIP failover, and Certbot with Let's Encrypt for automated SSL/TLS certificate management.

Designed to achieve an A+ SSL Labs score with hardened cipher suites, HSTS with preload, OCSP stapling, and modern TLS configuration. Deployed across multiple VMs with automated failover testing, Ansible configuration management, and comprehensive operational runbooks.

Key Features

High Availability

  • Keepalived VRRP — Virtual IP (VIP) automatically fails over from MASTER to BACKUP within 3 seconds
  • NGINX health tracking — Keepalived monitors NGINX process and port; triggers failover if either fails
  • Certificate sync — Certs automatically rsync'd from primary to standby after renewal
  • Automated failover testing — Script kills NGINX, verifies VIP migration, restores, and verifies VIP return

SSL/TLS Hardening (A+ Score)

  • TLSv1.2 and TLSv1.3 only — No legacy protocol support
  • Strong cipher suites — AEAD ciphers with PFS (ECDHE+AESGCM, CHACHA20)
  • HSTS with preload (Strict-Transport-Security: max-age=63072000; includeSubDomains; preload)
  • OCSP Stapling — Reduces TLS handshake latency and improves privacy
  • 4096-bit DH parameters — Custom Diffie-Hellman parameters
  • Session resumption — Shared session cache across workers
  • Complete certificate chain — Served with intermediate certificates
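
The HSTS policy listed above can be checked against the published preload-list requirements (a max-age of at least 31536000 seconds, plus includeSubDomains and preload). The following is a minimal sketch of such a check, not part of the project itself; the function name hsts_preload_ok is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if a Strict-Transport-Security header line
# meets the preload-list requirements, fails otherwise.
hsts_preload_ok() {
    # Normalize: lowercase, strip spaces and carriage returns.
    value=$(printf '%s' "$1" | tr 'A-Z' 'a-z' | tr -d ' \r')
    # Extract the numeric max-age value, if present.
    max_age=$(printf '%s' "$value" | sed -n 's/.*max-age=\([0-9][0-9]*\).*/\1/p')
    if [ -z "$max_age" ] || [ "$max_age" -lt 31536000 ]; then
        return 1
    fi
    case "$value" in *includesubdomains*) ;; *) return 1 ;; esac
    case "$value" in *preload*) ;; *) return 1 ;; esac
    return 0
}
```

One way to use it, with example.com standing in for a real hostname: hsts_preload_ok "$(curl -sI https://example.com | grep -i '^strict-transport-security:')" && echo "preload-ready".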

Load Balancing

  • Multiple algorithms — Round-robin, least_conn, ip_hash configurable per upstream
  • Active health checks — Periodic backend health verification with configurable intervals
  • WebSocket support — Full proxy support for WebSocket connections (Upgrade headers)
  • Connection draining — Graceful backend removal during deployments
  • Rate limiting — Per-IP rate limits for general traffic (10r/s), API (5r/s), and login (1r/s)

Security Headers

  • X-Frame-Options: DENY
  • X-Content-Type-Options: nosniff
  • X-XSS-Protection: 1; mode=block
  • Referrer-Policy: strict-origin-when-cross-origin
  • Content-Security-Policy with a strict policy
  • Permissions-Policy restricting browser features

Ansible Automation

  • 6 roles — nginx, keepalived, certbot, fail2ban, haproxy, common (hardening)
  • 6 playbooks — site (master), proxy-setup, ssl-setup, firewall, hardening, backend-setup
  • Jinja2 templates — All configs parameterized with variables for multi-environment use
  • Systemd timers — Modern cert renewal (not cron) with randomized delay
  • CIS benchmark hardening — SSH config, sysctl tuning, firewall rules, audit logging

Architecture

       DNS (Round Robin)
     ┌─────────┴─────────┐
     ▼                   ▼
┌──────────┐        ┌──────────┐
│ PROXY-01 │◄─VRRP─►│ PROXY-02 │
│ (MASTER) │        │ (BACKUP) │
│          │        │          │
│Keepalived│        │Keepalived│
│  NGINX   │        │  NGINX   │
│ Certbot  │◄─sync─►│ Certbot  │
│ Fail2ban │        │ Fail2ban │
└────┬─────┘        └────┬─────┘
     │  VIP: 10.0.0.100  │
     └─────────┬─────────┘
     ┌─────────┼─────────┐
     ▼         ▼         ▼
┌────────┐ ┌────────┐ ┌────────┐
│ App-01 │ │ App-02 │ │ App-03 │
│ :8080  │ │ :8080  │ │ :8080  │
└────────┘ └────────┘ └────────┘

Technical Implementation

Keepalived VRRP Configuration

Two proxy nodes run Keepalived with a shared virtual_router_id. The MASTER node (priority 101) holds the VIP. If NGINX stops or the health check script fails, Keepalived transitions to FAULT state and the BACKUP node (priority 100) assumes the VIP within one advert interval (1 second default).
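
The health check that Keepalived runs could look roughly like the sketch below, which tests both the NGINX process and its listening port. This is an illustration under assumptions rather than the project's actual script: the file name check_nginx.sh, the function names, and the default port 443 are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical health check for a Keepalived vrrp_script.
# Exit status 0 means healthy; any non-zero status marks the check as failed.

# Succeeds if a process with exactly this name is running.
process_running() {
    pgrep -x "$1" >/dev/null 2>&1
}

# Succeeds if a TCP connection to host:port opens within one second.
port_open() {
    timeout 1 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Returns 0 if healthy, 1 if the process is gone, 2 if the port is closed.
check_nginx() {
    host="${1:-127.0.0.1}"
    port="${2:-443}"
    process_running nginx || return 1
    port_open "$host" "$port" || return 2
    return 0
}

# Run the check only when installed under the assumed name check_nginx.sh,
# so the functions can also be sourced and exercised in isolation.
case "${0##*/}" in
    check_nginx*) check_nginx "$@"; exit $? ;;
esac
```

A vrrp_script block in keepalived.conf would point its script setting at this file and be listed under track_script in the VRRP instance. The distinct return codes are only there to make log output easier to interpret.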

Certificate Management

Certbot handles initial certificate issuance and renewal via DNS-01 or HTTP-01 challenges. A systemd timer runs certbot renew daily with a randomized delay. On successful renewal, a deploy hook reloads NGINX and rsyncs the new certificates to the peer proxy node.
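
A deploy hook along these lines might be sketched as follows. The peer hostname, the root SSH login, and the reload on the standby are assumptions made for illustration; only the local reload and the rsync to the peer come from the description above. Certbot exports RENEWED_LINEAGE to deploy hooks, which the sketch uses as its trigger.

```shell
#!/usr/bin/env bash
# Hypothetical Certbot deploy hook (for example, placed in
# /etc/letsencrypt/renewal-hooks/deploy/). PEER_HOST and LE_DIR are assumed names.
PEER_HOST="${PEER_HOST:-proxy-02}"
LE_DIR="${LE_DIR:-/etc/letsencrypt}"

deploy_renewed_cert() {
    lineage="$1"
    peer="$2"
    if [ -z "$lineage" ] || [ -z "$peer" ]; then
        echo "usage: deploy_renewed_cert <lineage-dir> <peer-host>" >&2
        return 1
    fi
    echo "renewed lineage: $lineage"
    # Validate the configuration before reloading the local NGINX.
    nginx -t || return 1
    systemctl reload nginx || return 1
    # live/ holds symlinks into archive/, so both are copied; -a preserves the links.
    rsync -a "$LE_DIR/live" "$LE_DIR/archive" "root@${peer}:${LE_DIR}/" || return 1
    # Assumption: the standby also needs a reload to pick up the new files.
    ssh "root@${peer}" "nginx -t && systemctl reload nginx"
}

# Certbot exports RENEWED_LINEAGE to deploy hooks; do nothing without it.
if [ -n "${RENEWED_LINEAGE:-}" ]; then
    deploy_renewed_cert "$RENEWED_LINEAGE" "$PEER_HOST"
fi
```

Copying only live/ and archive/ keeps the renewal configuration on the primary, so the standby does not attempt its own renewals.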

NGINX Worker Tuning

NGINX
worker_processes auto;

events {
    worker_connections 4096;
    multi_accept on;
}

http {
    # TCP optimizations
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    keepalive_requests 1000;
}

Failover Testing

The automated failover test script:

  1. Sends continuous HTTP requests to the VIP
  2. Kills NGINX on the MASTER node
  3. Measures time until VIP migrates (target: < 3 seconds)
  4. Restarts NGINX
  5. Verifies VIP returns to MASTER
  6. Reports total dropped requests and failover time
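
The six steps above can be sketched as a small script. This is an illustrative outline, not the project's testing/failover-test.sh: the hostnames proxy-01 and proxy-02, passwordless SSH with sudo, GNU date, and the use of systemctl stop as a stand-in for killing NGINX are all assumptions.

```shell
#!/usr/bin/env bash
# Sketch of a failover test. Hostnames, SSH access, and sudo rights are assumptions.
VIP="${VIP:-10.0.0.100}"
MASTER="${MASTER:-proxy-01}"
BACKUP="${BACKUP:-proxy-02}"

# Milliseconds since the epoch (relies on GNU date's %N).
now_ms() { echo $(( $(date +%s%N) / 1000000 )); }

# Succeeds if the given node currently has the VIP on one of its interfaces.
holds_vip() { ssh "$1" "ip -o addr show" | grep -qF " ${VIP}/"; }

# Polls until the node holds the VIP, then prints the elapsed milliseconds.
# Fails if the VIP has not arrived within the timeout (default 10 s).
wait_for_vip() {
    node="$1"
    timeout_ms="${2:-10000}"
    start=$(now_ms)
    until holds_vip "$node"; do
        if [ $(( $(now_ms) - start )) -ge "$timeout_ms" ]; then
            return 1
        fi
        sleep 0.1
    done
    echo $(( $(now_ms) - start ))
}

# Step 1: request the VIP in a loop, logging one "ok" or "fail" per request.
probe_loop() {
    while :; do
        if curl -ksf --max-time 1 -o /dev/null "https://${VIP}/"; then
            echo ok
        else
            echo fail
        fi
        sleep 0.1
    done
}

run_failover_test() {
    log=$(mktemp)
    probe_loop >"$log" 2>/dev/null &
    probe_pid=$!

    # Step 2: stop NGINX on the MASTER (a stand-in for killing the process).
    ssh "$MASTER" "sudo systemctl stop nginx"
    # Step 3: time the VIP migration to the BACKUP.
    failover_ms=$(wait_for_vip "$BACKUP") || { kill "$probe_pid"; return 1; }
    # Steps 4 and 5: restore NGINX and wait for the VIP to return.
    ssh "$MASTER" "sudo systemctl start nginx"
    failback_ms=$(wait_for_vip "$MASTER") || { kill "$probe_pid"; return 1; }

    kill "$probe_pid"
    wait "$probe_pid" 2>/dev/null || true
    # Step 6: report timings and the number of failed requests.
    dropped=$(grep -c fail "$log" || true)
    rm -f "$log"
    echo "failover time: ${failover_ms} ms"
    echo "failback time: ${failback_ms} ms"
    echo "dropped requests: ${dropped}"
}
```

Calling run_failover_test from a host that can reach both proxies prints the three report lines. The 100 ms polling interval bounds the measurement resolution, so a tighter loop would be needed to verify sub-second targets.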

Deployment

Docker (Local Testing)

Bash
cp .env.example .env
docker compose -f docker/docker-compose.yml up -d
./testing/failover-test.sh

Ansible (Production)

Bash
cd ansible/
ansible-playbook -i inventory/hosts.yml playbooks/site.yml

Impact

  • < 3 second automatic failover with Keepalived VRRP
  • A+ SSL Labs score with hardened cipher suites and HSTS preload
  • Zero-downtime certificate renewal with automated deploy hooks and peer sync
  • 3 load balancing algorithms configurable per upstream group
  • 4 rate limit zones protecting against abuse
  • CIS benchmark hardening on all proxy nodes
  • 6 Ansible roles for fully automated, idempotent deployment

Future Plans

  • Add ModSecurity WAF for application-layer protection
  • Implement mutual TLS (mTLS) for backend authentication
  • Add GeoIP-based routing for geographic traffic management
  • Integrate Prometheus NGINX Exporter for real-time metrics
  • Add chaos testing with random backend kills and network partitions
  • Deploy HAProxy as L4 load balancer in front of NGINX for extreme throughput