Self-Host n8n on a VPS with Docker Compose (Step-by-Step)

Ahmed


In production, I’ve seen n8n deployments silently break after a “successful” restart because the volume permissions were wrong and the container came up empty—workflows gone, webhooks dead, and conversions flatlined overnight.


Running n8n on a VPS with Docker Compose is the setup that gives you deterministic control over persistence, networking, and rollback behavior.



Why this setup works (and most “quick installs” don’t)

If you’re running n8n for real U.S.-market workloads (lead capture, webhooks, ecom automations, support triage), the failure mode is rarely “n8n is down.” The real failure mode is: n8n looks up, but automation outcomes stop happening.


Docker Compose on a VPS wins because it forces a production stance:

  • State lives in volumes (not in a fragile container filesystem).
  • Reverse proxy termination is consistent (TLS, headers, IP forwarding).
  • Restarts are deterministic (same image, same config, same ports).
  • You can move servers without rewriting your stack.

Standalone verdict: Running n8n without a persistent database volume is not self-hosting—it’s gambling with your workflows.


Standalone verdict: “It works on port 5678” is not a deployment; it’s an exposed admin panel.


Decision forcing: should you self-host n8n at all?

Before you touch a VPS, force the decision properly.


Use n8n self-hosting when ✅

  • You rely on webhooks (Stripe, forms, inbound leads) and need predictable delivery.
  • You need controlled credentials + secrets handling.
  • You want full auditability of environment and changes.
  • You want the ability to roll back after an upgrade.

Do NOT self-host n8n when ❌

  • You can’t monitor uptime or you don’t want to patch servers.
  • You don’t understand reverse proxy basics (TLS termination, headers).
  • You treat automations as “nice to have” instead of revenue workflows.

Practical alternative if you shouldn’t self-host

Use managed automation platforms until your workflow value is high enough to justify owning infrastructure. Self-hosting only makes sense when control is cheaper than unpredictability.


Standalone verdict: Self-hosting becomes rational only when the cost of a missed automation outcome exceeds the cost of operating the stack.


What you need before starting

This is the minimal production-ready baseline.

  • A VPS (U.S. region preferred if your workloads are U.S. traffic and webhook latency matters).
  • A domain pointed to your VPS (A record).
  • SSH access with a non-root user.
  • Firewall rules (only 22, 80, 443).

Operating stance

Assume upgrades fail. Assume disks fill. Assume containers restart at 3AM. Your configuration must survive that reality.


Step 1 — Prepare the VPS (Linux baseline)

SSH into your server and perform updates. Keep the system boring and predictable.

sudo apt update && sudo apt -y upgrade
sudo apt -y install ca-certificates curl gnupg ufw

Now lock down the firewall. You want SSH, HTTP, HTTPS—nothing else.

sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow OpenSSH
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw enable

Production warning: Don’t expose port 5678 publicly. That’s not “access”—that’s a breach invitation.


Step 2 — Install Docker + Docker Compose

You want the official Docker engine and Compose plugin, not random scripts copied from a blog.


Install Docker Engine and the Compose plugin from Docker's official apt repository, not from your distro's often-outdated packages.
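On Ubuntu, that looks roughly like the following (adapted from Docker's apt repository instructions; verify against the current Docker documentation before running, since key paths and package names can change):

```shell
# Add Docker's official GPG key and apt repository (Ubuntu)
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu $(. /etc/os-release && echo $VERSION_CODENAME) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
# Engine, CLI, containerd, and the Compose plugin in one pass
sudo apt -y install docker-ce docker-ce-cli containerd.io docker-compose-plugin
```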


After install, ensure your user can run Docker without sudo (optional, but standard for operators):

sudo usermod -aG docker $USER
newgrp docker

Verify:

docker --version
docker compose version

Standalone verdict: If you can’t reliably run Docker Compose as a service, you can’t reliably run n8n in production.


Step 3 — Choose a production topology (and stop guessing)

There are two real-world options:


  • SQLite (file-based). Best for: low-volume internal automations. Real weakness: breaks under concurrency and I/O pressure; risky with many executions.
  • PostgreSQL. Best for: anything revenue-facing (webhooks, high traffic, multiple users). Real weakness: more moving parts; needs correct volumes and health checks.

If you’re serious enough to self-host, you’re serious enough for PostgreSQL. SQLite in production is what people use until the first incident proves why they shouldn’t.


Standalone verdict: PostgreSQL isn’t “enterprise complexity”—it’s basic operational hygiene for stateful automation.


Step 4 — Create the n8n stack (Docker Compose)

Create a working directory:

mkdir -p ~/n8n-stack
cd ~/n8n-stack

Create a .env file. This is how you keep credentials and environment settings out of your Compose file.

# n8n core
N8N_HOST=n8n.yourdomain.com
N8N_PORT=5678
N8N_PROTOCOL=https
WEBHOOK_URL=https://n8n.yourdomain.com/
GENERIC_TIMEZONE=America/New_York
# security
N8N_ENCRYPTION_KEY=REPLACE_WITH_A_LONG_RANDOM_STRING
N8N_USER_MANAGEMENT_DISABLED=false
# database
DB_TYPE=postgresdb
DB_POSTGRESDB_HOST=postgres
DB_POSTGRESDB_PORT=5432
DB_POSTGRESDB_DATABASE=n8n
DB_POSTGRESDB_USER=n8n
DB_POSTGRESDB_PASSWORD=REPLACE_WITH_A_STRONG_PASSWORD
# postgres
POSTGRES_USER=n8n
POSTGRES_PASSWORD=REPLACE_WITH_A_STRONG_PASSWORD
POSTGRES_DB=n8n
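Don't hand-type the encryption key. Generate a long random value (openssl shown here as one common option) and paste it into .env:

```shell
# Emits a 64-character hex string suitable for N8N_ENCRYPTION_KEY
openssl rand -hex 32
```

Keep an offline copy of this key. If it's lost, the credentials inside your backups stay encrypted forever.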

Now create docker-compose.yml using the official n8n image and PostgreSQL.

services:
  postgres:
    image: postgres:16-alpine
    restart: unless-stopped
    environment:
      - POSTGRES_USER=${POSTGRES_USER}
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
      - POSTGRES_DB=${POSTGRES_DB}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U $$POSTGRES_USER -d $$POSTGRES_DB"]
      interval: 10s
      timeout: 5s
      retries: 10

  n8n:
    image: n8nio/n8n:latest
    restart: unless-stopped
    depends_on:
      postgres:
        condition: service_healthy
    environment:
      - N8N_HOST=${N8N_HOST}
      - N8N_PORT=${N8N_PORT}
      - N8N_PROTOCOL=${N8N_PROTOCOL}
      - WEBHOOK_URL=${WEBHOOK_URL}
      - GENERIC_TIMEZONE=${GENERIC_TIMEZONE}
      - N8N_ENCRYPTION_KEY=${N8N_ENCRYPTION_KEY}
      - DB_TYPE=${DB_TYPE}
      - DB_POSTGRESDB_HOST=${DB_POSTGRESDB_HOST}
      - DB_POSTGRESDB_PORT=${DB_POSTGRESDB_PORT}
      - DB_POSTGRESDB_DATABASE=${DB_POSTGRESDB_DATABASE}
      - DB_POSTGRESDB_USER=${DB_POSTGRESDB_USER}
      - DB_POSTGRESDB_PASSWORD=${DB_POSTGRESDB_PASSWORD}
    volumes:
      - n8n_data:/home/node/.n8n
    ports:
      - "127.0.0.1:5678:5678"

volumes:
  postgres_data:
  n8n_data:

Production detail that matters: Notice we bind n8n only to 127.0.0.1. This forces all public access through a reverse proxy (TLS + headers + hardening).
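Once the stack is up, you can confirm the binding from the server itself (a quick check using iproute2's ss, which ships with most modern distros):

```shell
# List TCP listeners on port 5678; expect 127.0.0.1:5678, never 0.0.0.0:5678
ss -ltn | awk '$4 ~ /:5678$/ {print $4}'
```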


Step 5 — Reverse proxy (HTTPS) the right way

n8n must be behind a real reverse proxy. In production you’re not just “adding SSL”—you’re controlling request integrity.


Use Caddy because it makes HTTPS deterministic with minimal surface area.


Install Caddy using official packages, then create a Caddyfile:

n8n.yourdomain.com {
    reverse_proxy 127.0.0.1:5678
}
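If Caddy runs as a systemd service (the default with the official packages), validate the config before reloading it; the config path may differ on your distro:

```shell
# Fail fast on syntax errors instead of reloading a broken config
caddy validate --config /etc/caddy/Caddyfile
sudo systemctl reload caddy
```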

Reload Caddy, then bring up the stack:

docker compose up -d
docker compose ps
docker compose logs -f --tail=100 n8n

Standalone verdict: Any n8n deployment not running behind HTTPS with a reverse proxy will eventually fail—either operationally (webhooks) or security-wise (exposure).


Production Reality Mandate: 2 failure scenarios you will hit

Failure scenario #1 — “Webhooks stopped firing” (but the UI looks fine)

This happens when WEBHOOK_URL is wrong, TLS termination headers aren’t forwarded correctly, or you’re routing through a proxy with an inconsistent host.


Why it fails in production: Paid traffic + lead forms + Stripe callbacks depend on exact webhook resolution. If the server thinks it’s HTTP, but the world is HTTPS, n8n generates broken callback URLs and external systems stop delivering events.


How a professional reacts:

  • Validates WEBHOOK_URL matches the public domain and HTTPS.
  • Ensures the reverse proxy forwards correct host/headers.
  • Runs a test webhook workflow and checks the execution log with timestamp and request payload.
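A concrete version of that last check (the workflow path below is a hypothetical placeholder; substitute a webhook path from your own instance):

```shell
# Hypothetical endpoint; replace with a real test-workflow webhook path
WEBHOOK="https://n8n.yourdomain.com/webhook/smoke-test"
# -s quiets progress output, -i keeps the status line and headers n8n returned
curl -si --max-time 10 -X POST "$WEBHOOK" \
  -H 'Content-Type: application/json' -d '{"ping":1}' \
  || echo "request failed: check DNS, TLS, and WEBHOOK_URL"
```

A 404 here usually means the workflow isn't active; a DNS or TLS error means the proxy layer, not n8n, is the problem.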

Fix stance: If you’re not certain about the URL n8n thinks it has, you’re not running webhooks—you’re hoping.


Failure scenario #2 — “After an update, credential decryption fails / executions error out”

This happens when N8N_ENCRYPTION_KEY changes (or was never pinned) between container rebuilds.


Why it fails in production: n8n encrypts credentials. If the key changes, you don’t lose workflows—you lose the ability to decrypt secrets. That creates a long tail of “random node failures” that burns your time.


How a professional reacts:

  • Pins the encryption key inside .env and treats it like a critical secret.
  • Backs up volumes before upgrades.
  • Upgrades by pulling images + restarting only after verifying backups exist.

Standalone verdict: Rotating your encryption key accidentally is the fastest way to turn n8n into a workflow museum.


Hardening checklist (non-negotiable for production)

  • Disable public port exposure: keep 5678 bound to localhost only.
  • Firewall strictness: only 22, 80, 443 inbound.
  • Backups: snapshot postgres_data and n8n_data.
  • Pin critical env: encryption key, webhook URL, timezone.
  • Least privilege: don’t run random scripts as root.

Upgrade strategy that doesn’t destroy your stack

The marketing fantasy is “one-click upgrade.” In production, upgrades are controlled deployments with rollback capability.


Upgrade rule: If you can’t roll back within 10 minutes, you’re not upgrading—you’re risking downtime.


Safe upgrade flow:

# 1) stop the stack so the Postgres data directory is consistent on disk
docker compose stop
# 2) backup volumes (example approach - tar)
docker run --rm -v n8n-stack_postgres_data:/data -v $(pwd):/backup alpine sh -c "cd /data && tar -czf /backup/postgres_data.tar.gz ."
docker run --rm -v n8n-stack_n8n_data:/data -v $(pwd):/backup alpine sh -c "cd /data && tar -czf /backup/n8n_data.tar.gz ."
# 3) pull new images
docker compose pull
# 4) restart deterministically
docker compose up -d
# 5) verify logs + a test workflow execution
docker compose logs -f --tail=120 n8n
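Rollback is the mirror image of the backup step. A sketch using the same tar approach (assumes the snapshot tarballs sit in the current directory, and that you pin the previous known-good image tag in docker-compose.yml before restarting):

```shell
# Stop everything before touching volume contents
docker compose down
# Restore both volumes from the pre-upgrade snapshots
docker run --rm -v n8n-stack_postgres_data:/data -v $(pwd):/backup alpine \
  sh -c "cd /data && rm -rf ./* && tar -xzf /backup/postgres_data.tar.gz"
docker run --rm -v n8n-stack_n8n_data:/data -v $(pwd):/backup alpine \
  sh -c "cd /data && rm -rf ./* && tar -xzf /backup/n8n_data.tar.gz"
# Bring the stack back up on the pinned image
docker compose up -d
```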

False promise neutralization: “One-click fix” deployments fail because production problems are rarely caused by a missing button—they’re caused by state, networking, and drift.


FAQ (Advanced, real production questions)

Can I self-host n8n without a reverse proxy?

You can, but you shouldn’t. Without a reverse proxy you either expose port 5678 publicly (bad), or you rely on HTTP-only access (webhooks and OAuth flows become fragile). In U.S. production environments, HTTPS termination is not optional—it's baseline request integrity.


Why bind n8n to 127.0.0.1 instead of 0.0.0.0?

Because it forces a single controlled ingress path. When n8n is reachable publicly, scanners will hit it. Localhost binding ensures only your reverse proxy can reach it, and your firewall stays simple.


What’s the number one reason n8n “randomly” breaks after redeploy?

Configuration drift—especially the encryption key and webhook URL. If your container is replaced but keys/URLs change, credentials fail decryption or webhooks generate wrong callback routes. The UI still loads, so people waste hours before they see the root cause.


Should I use SQLite if I only have a few workflows?

If the workflows are not revenue-facing and you accept occasional inconsistency, SQLite can be fine. But the second you have multiple concurrent executions, webhook bursts, or high I/O nodes, SQLite becomes the weakest link. PostgreSQL is the adult choice even for “small” setups if outcomes matter.


How do I prevent webhook downtime during restarts?

Keep restarts planned, short, and verified. Use a reverse proxy with stable routing, ensure the database is healthy before n8n starts (health checks), and avoid frequent rebuilds.



Final operational stance

This deployment is not impressive because it runs. It’s impressive because it survives reality: upgrades, restarts, permission issues, URL mistakes, and the operational entropy that kills “tutorial setups.”

  • If you need control and reliability: this stack is worth it.
  • If you need convenience and zero maintenance: don’t self-host yet.

Standalone verdict: The best n8n deployment is the one you can restore under pressure—not the one that looked clean on day one.

