Kubernetes Deployment

Mezite ships a Helm chart for Kubernetes deployments. The chart lives at deploy/helm/mezite/ in the repository.

Prerequisites

  • Kubernetes 1.27+
  • Helm 3.12+
  • A PostgreSQL 16 instance (managed or self-hosted)

Installation

Install with Helm bash
# Install from the local chart in the repository
helm install mezite ./deploy/helm/mezite/ \
  --namespace mezite \
  --create-namespace \
  --set database.host=postgres \
  --set database.password=secret \
  --set proxy.publicAddr=mezite.example.com:443 \
  --set caKeyPassphrase=change-me

values.yaml

Key configuration options in deploy/helm/mezite/values.yaml:

values.yaml (key fields) yaml
replicas: 2

image:
  repository: ghcr.io/leonardaustin/mezite
  tag: latest
  pullPolicy: IfNotPresent

clusterName: mezite

# Database connection. For an externally-managed PostgreSQL (RDS, Aurora,
# CloudSQL), set database.external.enabled=true and provide the connection
# details under database.external.*.
database:
  host: postgres
  port: 5432
  name: mezite
  user: mezite
  password: ""
  sslmode: require
  external:
    enabled: false
    host: ""
    port: 5432
    name: ""
    user: ""
    sslmode: require
    existingSecret: ""  # K8s secret name with a 'password' key

# CA signing-key passphrase (required for production).
caKeyPassphrase: ""

proxy:
  publicAddr: ""        # LB hostname clients connect to
  maxConnsPerIP: 100

# Service ports. ClusterIP by default; expose via an Ingress, Gateway,
# or LB Service per your platform conventions.
service:
  type: ClusterIP
  ports:
    https: 3080
    ssh: 3023
    tunnel: 3024
    grpc: 3025

persistence:
  enabled: true
  size: 10Gi
  accessMode: ReadWriteOnce
  storageClass: ""

resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: "1"
    memory: 512Mi

log:
  level: info
  format: json

PostgreSQL Setup

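The Helm chart does not bundle PostgreSQL; point database.* at an existing instance. For tests or trial installs, the Bitnami PostgreSQL chart works well. With the release name pg used below, that chart creates a Service named pg-postgresql, so set database.host=pg-postgresql when installing Mezite.

Example: deploy PostgreSQL with Helm bash
# Add the Bitnami chart repository (skip if it is already configured)
helm repo add bitnami https://charts.bitnami.com/bitnami

# Trial-only credentials; use a managed instance and real secrets in production
helm install pg bitnami/postgresql \
  --namespace mezite \
  --create-namespace \
  --set auth.username=mezite \
  --set auth.password=secret \
  --set auth.database=mezite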

Configuration

Override any value at install time with --set, or supply a custom values file with -f:

Custom values file bash
helm install mezite ./deploy/helm/mezite/ \
  --namespace mezite \
  --create-namespace \
  -f my-values.yaml
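
As an illustration, a my-values.yaml for an externally managed database might look like the sketch below. It uses only keys that appear in the values.yaml excerpt above. The database hostname and the Secret name mezite-db are placeholders, and the Secret is assumed to exist already with a 'password' key.

```yaml
# my-values.yaml (illustrative sketch; hostname and Secret name are placeholders)
replicas: 2

database:
  external:
    enabled: true
    host: mezite-db.internal.example.com   # placeholder endpoint
    port: 5432
    name: mezite
    user: mezite
    sslmode: require
    existingSecret: mezite-db              # Secret with a 'password' key

proxy:
  publicAddr: mezite.example.com:443       # hostname clients connect to

log:
  level: info
  format: json
```

Sensitive values such as caKeyPassphrase are left out of the file here on purpose; one option is to pass them with --set at install time so they do not end up in version control.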

Scaling

The auth and proxy services share a single Deployment and are stateless (all state lives in PostgreSQL plus the persistent volume), so they can be scaled horizontally:

Scale the deployment bash
kubectl scale deployment mezite-mezhub --replicas=3 -n mezite

Agents maintain persistent reverse tunnels to the proxy. When scaling the proxy, ensure your load balancer supports long-lived TCP connections on port 3024 (the agent tunnel port).

Ingress and Single-Port (ALPN) Routing

The recommended public shape for a Kubernetes deployment is the same shape that managed Mezite uses: one external port (:443) carrying HTTPS, SSH, and the agent tunnel, demultiplexed by TLS ALPN. This is what the proxy's proxy.single_port mode is built for, and it is the only shape that reliably traverses corporate firewalls and L4 load balancers.

Two viable topologies in Kubernetes:

  • L4 Service of type LoadBalancer → mezhub Pod with proxy.single_port=true. The LB does no L7 work; TLS is terminated by mezhub itself, which means agent reverse tunnels and SSH traffic remain end-to-end TLS to the Pod. This is the simplest shape and the easiest one to reason about under cert rotation.
  • Ingress / Gateway with TLS passthrough on the public listener, routing to the mezhub Service's :3080 port. Most Ingress controllers (nginx, HAProxy, Traefik) support TLS passthrough. Do not terminate TLS at the Ingress: doing so would break the ALPN demultiplexing the proxy relies on for SSH and tunnel routing.
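
The two topologies can be sketched as manifests. Both are illustrations under stated assumptions, not output of the chart. The selector label app.kubernetes.io/name: mezite is borrowed from the NetworkPolicy example on this page. The Service name mezite-mezhub is assumed from the Deployment name used in the scaling example. targetPort 3080 assumes the single-port listener binds the HTTPS port inside the Pod. The Ingress annotation is specific to ingress-nginx, whose controller must also be started with --enable-ssl-passthrough.

```yaml
# Topology 1: L4 LoadBalancer straight to the mezhub Pods (no L7 processing).
apiVersion: v1
kind: Service
metadata:
  name: mezite-public                 # hypothetical name
  namespace: mezite
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: mezite    # assumed Pod label
  ports:
    - name: single-port
      protocol: TCP
      port: 443                       # external port clients and agents dial
      targetPort: 3080                # assumed single-port listener in the Pod
---
# Topology 2: ingress-nginx with TLS passthrough (TLS still ends at mezhub).
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: mezite
  namespace: mezite
  annotations:
    nginx.ingress.kubernetes.io/ssl-passthrough: "true"
spec:
  ingressClassName: nginx
  rules:
    - host: mezite.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: mezite-mezhub   # assumed Service name
                port:
                  number: 3080
```

Only one of the two documents would be applied in a given cluster. Other controllers (HAProxy, Traefik, Gateway API implementations) expose passthrough through their own resources or annotations.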

Set proxy.public_addr (proxy.publicAddr in the Helm values) to the external hostname and port (mezite.example.com:443) so that WebAuthn and the cluster's enrollment URL match what clients connect to. See Configuration for the env-var spelling.

High Availability

Mezite supports running multiple mezhub replicas behind a single Service. Most code paths are stateless because all durable state lives in PostgreSQL; the remaining state-machine paths (CA rotation phase advances, certain bootstrap actions) coordinate via the database so multiple replicas can run safely.

The IAM challenge store used by the AWS IAM join method is the one exception — it lives in-process today, so the IAM join flow assumes a single replica per cluster. For multi-replica clusters that do not use IAM-based joins (i.e. agents join with bootstrap tokens via mezctl tokens create), this is not a constraint. The database-backed challenge store work is tracked separately.

Recommendations for an HA deployment:

  • Use an external, managed PostgreSQL with at least one replica.
  • Set replicas: 2 or higher in values.yaml. Use a PodDisruptionBudget that keeps at least one Pod ready during voluntary disruptions.
  • Configure the Service to send agent traffic (:3024 / single-port :443) to all replicas; each agent reverse tunnel ends at exactly one Pod, but the cluster-side state machine routes session traffic to the right Pod via the database-backed inventory.
  • Run database migrations once per upgrade. The current Helm chart in deploy/helm/mezite/ does not ship a migration hook — for HA upgrades, operators should run mezhub migrate up out-of-band (from a one-shot Pod, a bastion, or a CI step) before scaling the Deployment up, so multiple Pods don't race the schema. A future chart revision may add a pre-install Job with a leader-elected init container; until then, treat migrations as an explicit pre-rollout step.
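
Two of these recommendations can be sketched as manifests. This is a minimal illustration, not something the chart ships. The Pod label is assumed to match the one in the NetworkPolicy example on this page. The Job assumes the mezhub binary is on the image's PATH. The Secret mezhub-db-env is a hypothetical object holding whatever database settings mezhub reads from its environment (see Configuration for the actual variable names).

```yaml
# Keep at least one mezhub Pod ready during voluntary disruptions.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: mezhub
  namespace: mezite
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: mezite   # assumed Pod label
---
# One-shot migration Job; let it complete before rolling out new Pods.
apiVersion: batch/v1
kind: Job
metadata:
  name: mezhub-migrate
  namespace: mezite
spec:
  backoffLimit: 0                      # fail fast; inspect logs instead of retrying
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          # Pin the tag to the release being rolled out rather than latest.
          image: ghcr.io/leonardaustin/mezite:latest
          command: ["mezhub", "migrate", "up"]
          envFrom:
            - secretRef:
                name: mezhub-db-env    # hypothetical Secret with DB settings
```

A rollout script could apply the Job, block on kubectl wait --for=condition=complete job/mezhub-migrate -n mezite --timeout=5m, and only then run helm upgrade, so that exactly one process touches the schema.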

NetworkPolicies

For a hardened cluster, restrict mezhub Pod ingress to the listeners you expose externally, and restrict its egress to the database and (when used) AWS KMS. The minimum useful NetworkPolicy for the proxy looks like:

NetworkPolicy: minimum useful policy for mezhub yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: mezhub
  namespace: mezite
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: mezite
  policyTypes: [Ingress, Egress]
  ingress:
    # External traffic comes from the Ingress / LoadBalancer; restrict to
    # the listeners you actually expose.
    - ports:
        - protocol: TCP
          port: 3080      # HTTPS (or 443 in single-port mode)
        - protocol: TCP
          port: 3025      # gRPC Auth (mezctl admin + agent registration)
        - protocol: TCP
          port: 3023      # SSH
        - protocol: TCP
          port: 3024      # agent tunnel
  egress:
    # Database (in-cluster PostgreSQL example — see below for external PG).
    - to:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: postgresql
      ports:
        - protocol: TCP
          port: 5432
    # DNS
    - to:
        - namespaceSelector: {}
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP      # DNS retries over TCP when a response is truncated
          port: 53
    # AWS KMS (only needed when kms.enabled=true). Allow egress to AWS API
    # endpoints; in practice this is "443 to the internet" or to a VPC
    # endpoint for kms.
    - ports:
        - protocol: TCP
          port: 443

The example above assumes Postgres runs in-cluster behind a podSelector. For externally-managed PostgreSQL (RDS, Cloud SQL, a self-managed VM, or a PG operator in a different namespace), replace the database egress rule with an ipBlock for the database's address (a single /32 or the VPC subnet's CIDR) or a namespaceSelector + podSelector targeting the operator's namespace. For example:

NetworkPolicy egress: external PostgreSQL via ipBlock yaml
  egress:
    # External Postgres (e.g. RDS endpoint resolved to a private subnet).
    - to:
        - ipBlock:
            cidr: 10.0.0.0/16    # your DB subnet (use a /32 for a single host)
      ports:
        - protocol: TCP
          port: 5432
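
The namespaceSelector + podSelector alternative mentioned above could look like the fragment below. It is a sketch: the namespace name databases and the Pod label are placeholders for whatever your PostgreSQL operator actually uses. The kubernetes.io/metadata.name label is set automatically on every namespace by Kubernetes.

```yaml
  egress:
    # PostgreSQL managed by an operator in another namespace.
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: databases   # placeholder namespace
          podSelector:
            matchLabels:
              app.kubernetes.io/name: postgresql       # assumed Pod label
      ports:
        - protocol: TCP
          port: 5432
```

Because namespaceSelector and podSelector sit in the same list item, both must match: only Pods with that label inside that namespace are reachable. Writing them as two separate list items would instead allow either match.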

Multi-tenant managed deployments add a per-tenant isolation NetworkPolicy on top of the above to block cross-tenant TCP traffic. See the Troubleshooting guide for how to diagnose policy denials.