Feb 26
Automated Split DNS for a Kubernetes cluster
by Ryan Welch · 8 Min Read
I self-host a number of services and want them to have clean domain names that work both inside and outside my network. On the LAN, requests should stay local and resolve to the internal load balancer. Away from home, the same hostnames should resolve publicly and route back through a VPS. I also want DNS records managed automatically from Kubernetes, with control over which services are internal-only, external, or both.
The approach I’ve settled on uses CoreDNS, etcd, and ExternalDNS for the internal side, with a wildcard DNS zone and a Tailscale-connected VPS for external access. The external tunnel setup is covered in another article; this one focuses on the DNS side.
Before getting started, you’ll need a Kubernetes cluster with MetalLB (or another bare-metal load balancer) to assign IPs to your ingress services, and a domain you control. In the examples below I’m using Helm and Flux to manage deployments, but this is optional. For the external side, you’ll need a way of routing external traffic to your cluster; as mentioned, I’m using a VPN tunnel, which I covered in another article.
Split DNS returns different IP addresses for the same domain depending on where the query comes from. Internal queries return the cluster’s MetalLB load balancer IP. External queries resolve via a public wildcard DNS zone pointing to a VPS, which forwards traffic through Tailscale to the nginx ingress in the cluster.
This keeps internal traffic local and avoids unnecessary internet round trips, while the same hostnames work everywhere.
Three components work together to automatically manage internal DNS.
CoreDNS handles DNS resolution with two data sources: the etcd plugin for dynamic records from Kubernetes and static zone files as a fallback for things like printers and NAS devices. Queries are answered from whatever’s currently in etcd; no restarts or cache invalidation are needed.
etcd is a distributed key-value store that acts as the DNS database. Records are stored under /skydns, updated by ExternalDNS, and queried by CoreDNS in real time.
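For reference, the etcd plugin uses the SkyDNS storage layout: the domain’s labels are reversed into a key path under /skydns, and the value is a small JSON object describing the record. A quick sketch of that mapping (the helper function is mine, not part of any tool):

```python
import json

def skydns_key(domain: str, prefix: str = "/skydns") -> str:
    """Reverse the domain labels into an etcd key path, SkyDNS-style."""
    labels = domain.rstrip(".").split(".")
    return prefix + "/" + "/".join(reversed(labels))

# home.example.com is stored under /skydns/com/example/home
key = skydns_key("home.example.com")
print(key)  # -> /skydns/com/example/home

# The value is JSON with the target and optional TTL (illustrative values)
value = json.dumps({"host": "10.0.100.1", "ttl": 300})
```

This is why the `etcdctl get /skydns --prefix` check shown later lists one key per published hostname.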
ExternalDNS watches Kubernetes resources (Ingress, Service, DNSEndpoint) and automatically publishes DNS records to etcd based on annotations.
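The DNSEndpoint CRD is useful for records that don’t map to any Ingress or Service. A hypothetical example (the name and IP are mine):

```yaml
apiVersion: externaldns.k8s.io/v1alpha1
kind: DNSEndpoint
metadata:
  name: router
  namespace: networking
spec:
  endpoints:
    - dnsName: router.example.internal
      recordType: A
      recordTTL: 300
      targets:
        - 10.0.0.1
```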
External access uses a simpler, manually configured setup. A wildcard DNS zone (e.g., *.example.com) at your DNS provider points to the public IP of a VPS. The VPS runs Tailscale and forwards incoming traffic to the external nginx ingress in the cluster. The Tailscale operator exposes that ingress with a fixed Tailscale IP, so the VPS always knows where to send traffic.
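Concretely, the zone at the provider needs just one record; a sketch, with a placeholder IP from the documentation range standing in for the VPS address:

```
*.example.com.   300   IN   A   203.0.113.10   ; VPS public IP
```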
Using a wildcard record has a privacy benefit: it avoids leaking which subdomains actually exist. Individual DNS records for each service would expose the names and number of services you’re running.
This only needs to be set up once. For details on the VPS and Tailscale forwarding, see my other article here.
Note
The external DNS configuration is currently manual. I plan to automate this in future so that external DNS records are also managed from Kubernetes, similar to how the internal side works today.
The cluster runs two separate nginx ingress instances: one for internal traffic and one for external. Each has its own ingress class, so you control where a service is reachable at the manifest level.
internal-nginx routes traffic from the LAN via the internal MetalLB IP. nginx (external) routes traffic arriving from the VPS via Tailscale, reachable from the internet through the wildcard DNS zone.
To set this up, deploy two separate Helm releases of the ingress-nginx chart, each with a different ingressClass name and its own MetalLB service IP. The internal instance gets a LAN-routable IP, and the external instance gets a separate IP that the Tailscale operator exposes.
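As a sketch, the two releases differ mainly in ingress class and service IP. The IPs and names below are examples; the ingress-nginx chart also wants a distinct controllerValue per instance so the two controllers don’t claim each other’s Ingresses:

```yaml
# values for the internal release
controller:
  ingressClassResource:
    name: internal-nginx
    controllerValue: k8s.io/internal-nginx
  service:
    annotations:
      metallb.universe.tf/loadBalancerIPs: 10.0.100.1 # LAN-routable IP
---
# values for the external release
controller:
  ingressClassResource:
    name: nginx
    controllerValue: k8s.io/ingress-nginx
  service:
    annotations:
      metallb.universe.tf/loadBalancerIPs: 10.0.100.2 # exposed via the Tailscale operator
```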
ExternalDNS watches both ingress classes and publishes records for all services to etcd, so internal clients can resolve hostnames for services on either ingress. However, the external wildcard DNS only forwards traffic to the external nginx ingress, so services using only internal-nginx are unreachable from outside the network. The external boundary is enforced by the architecture itself rather than just DNS.
A service can use one or both ingress classes. Something like a personal dashboard might only use internal-nginx, making it LAN-only. A service you want accessible remotely would use nginx or both.
Here’s a real-world example using Home Assistant. The main web UI is exposed externally so it’s accessible from anywhere, while a code-server sidecar for editing configuration files is kept internal-only:
```yaml
ingress:
  main:
    className: nginx # external ingress
    annotations:
      cert-manager.io/cluster-issuer: letsencrypt-prod
    hosts:
      - host: home.example.com
        paths:
          - path: /
            pathType: Prefix
            service:
              identifier: main
              port: http
  code:
    className: internal-nginx # internal only
    annotations:
      cert-manager.io/cluster-issuer: letsencrypt-prod
    hosts:
      - host: ha-config.example.internal
        paths:
          - path: /
            pathType: Prefix
            service:
              identifier: code
              port: http
```

The className is all that determines visibility. home.example.com uses the nginx ingress class, so it’s reachable both internally (via CoreDNS) and externally (via the wildcard DNS and VPS). ha-config.example.internal uses internal-nginx, so it’s only resolvable on the LAN.
ExternalDNS picks up both hostnames and publishes them to etcd automatically. No manual DNS changes needed when deploying or updating the service.
You’ll need a simple etcd deployment; a single node is fine for this. The main thing is that it’s reachable from both ExternalDNS (to write records) and CoreDNS (to read them). If you don’t already have etcd running in your cluster, the etcd-io Helm chart ↗ or a simple StatefulSet will do the job. Just note the service IP or endpoint, since both ExternalDNS and CoreDNS need it.
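If you go the StatefulSet route, a minimal single-node sketch could look like this. The image tag and namespace are examples, and there’s no persistence or auth configured here, so treat the stored records as disposable:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: etcd-dns
  namespace: networking
spec:
  serviceName: etcd-dns
  replicas: 1
  selector:
    matchLabels:
      app: etcd-dns
  template:
    metadata:
      labels:
        app: etcd-dns
    spec:
      containers:
        - name: etcd
          image: quay.io/coreos/etcd:v3.5.17 # example tag
          command:
            - etcd
            - --data-dir=/var/lib/etcd
            - --listen-client-urls=http://0.0.0.0:2379
            - --advertise-client-urls=http://etcd-dns.networking.svc.cluster.local:2379
          ports:
            - name: client
              containerPort: 2379
```

Losing this etcd is low-stakes: ExternalDNS will repopulate the records from the cluster state on its next sync.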
Deploy and configure ExternalDNS to handle the internal DNS side. Here’s an example HelmRelease configuration using Flux:
```yaml
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: &app external-dns-etcd
  namespace: networking
spec:
  interval: 15m
  chart:
    spec:
      chart: external-dns
      version: 1.20.0
      sourceRef:
        kind: HelmRepository
        name: external-dns
        namespace: flux-system
  maxHistory: 2
  install:
    createNamespace: true
    remediation:
      retries: 3
  upgrade:
    cleanupOnFail: true
    remediation:
      retries: 3
  uninstall:
    keepHistory: false
  values:
    fullnameOverride: *app
    sources: ["ingress", "crd", "service"]
    provider:
      name: coredns
    env:
      - name: ETCD_URLS
        value: "http://10.0.0.10:2379" # etcd IP
    extraArgs:
      - --crd-source-apiversion=externaldns.k8s.io/v1alpha1
      - --crd-source-kind=DNSEndpoint
      - --ignore-ingress-tls-spec
      - --ingress-class=internal-nginx
      - --ingress-class=nginx
    policy: upsert-only
    txtOwnerId: "internal-dns-coredns"
    domainFilters:
      - example.internal
      - example.com
```

The important bits: sources tells ExternalDNS to watch Ingress, Service, and DNSEndpoint resources. The provider is set to CoreDNS, which uses etcd as its backend. The upsert-only policy prevents accidental deletions, and domainFilters restricts which domains ExternalDNS will manage. Both ingress classes are listed so records are published for internal and external services alike.
Refer to the ExternalDNS documentation ↗ for more detail, but this should work for most setups with etcd and ingresses in your cluster.
The CoreDNS Corefile configuration:
```
.:53 {
    cache 3600
    forward . 1.1.1.1 8.8.8.8 {
        policy random
    }
    log
    errors
    whoami
}

example.internal:53 {
    etcd {
        path /skydns
        endpoint http://10.0.0.10:2379
        fallthrough
    }
    forward . 127.0.0.1:5553
    log
    errors
}

example.internal:5553 {
    file /etc/coredns/zones/db.example.internal
    log
    errors
}

example.com:53 {
    etcd {
        path /skydns
        endpoint http://10.0.0.10:2379
        fallthrough
    }
    log
    errors
}
```

The root zone (:53) forwards any unmatched queries to public DNS, so CoreDNS acts as a full resolver for your LAN. The example.internal zone checks etcd first for dynamic records; if nothing matches, the fallthrough directive passes the query to a second zone on port 5553 that serves a static zone file. You can’t use both a zone file and the etcd plugin in the same block, which is why the static records are served on a separate port that the main zone forwards to. The example.com zone just queries etcd for any dynamic records ExternalDNS has published.
The static zone file (db.example.internal) is for devices that aren’t managed by Kubernetes but still need DNS entries on the LAN, though you may choose to manage these entries elsewhere:
```
$ORIGIN example.internal.
$TTL 1h

@   IN SOA ns.example.internal. admin.example.internal. (
        2020010511 ; Serial
        1d         ; Refresh
        2h         ; Retry
        4w         ; Expire
        1h )       ; Minimum TTL

@   IN A  10.0.100.0
@   IN NS ns.example.internal.

ns.example.internal.   IN A 10.0.0.1
nas.example.internal.  IN A 10.0.0.2
```

Update the IPs and hostnames to match your network. The serial number should be incremented whenever you change the file; CoreDNS uses it to detect updates.
Finally, update your router’s DHCP settings to hand out the CoreDNS IP as the DNS server for clients on your LAN. Make sure your clients aren’t overriding the DHCP-provided DNS (some devices and browsers do this by default with DNS-over-HTTPS).
Once everything is deployed, you can verify the internal side with dig:
```shell
# Query CoreDNS directly for a dynamic record
dig home.example.com @<coredns-ip>
# Should return the internal MetalLB IP, e.g. 10.0.100.x

# Query a static record from the zone file
dig nas.example.internal @<coredns-ip>
# Should return 10.0.0.2
```

You can also check what ExternalDNS has written to etcd directly:

```shell
etcdctl get /skydns --prefix --keys-only
```

If records are showing up in etcd but not resolving through CoreDNS, double-check that the etcd endpoint is correct in your Corefile and that the fallthrough directive is present.
This setup gives me automatic internal DNS for anything running in the cluster, with split resolution so the same hostnames work on the LAN and over the internet. The internal side is fully automated: deploy a service with the right ingress class and the DNS entries are created. The external side uses a wildcard record pointing at a VPS, which keeps things simple and avoids leaking subdomain names.
There are a few things I’d like to improve in future: automating the external DNS so it’s managed from Kubernetes the same way the internal side is, making the etcd endpoint discoverable rather than hardcoded, and migrating from Ingress to Gateway API routes. I’ll cover those in follow-up posts.
Last edited Apr 12