How to Troubleshoot CoreDNS Issues in DOKS Clusters

Validated on 21 Nov 2025 • Last edited on 13 Jan 2026

DOKS uses CoreDNS for in-cluster DNS management. This article provides steps to diagnose CoreDNS issues, collect diagnostic information, and determine whether the problem is related to configuration, resource constraints, or network connectivity.

Common Symptoms of CoreDNS Issues

Common symptoms of CoreDNS issues include:

  • Pods that cannot resolve external domain names.
  • Pods that cannot resolve internal Kubernetes service names.
  • Intermittent DNS resolution failures or timeouts.
  • Application logs showing DNS lookup errors.
  • CoreDNS pods showing high CPU or memory utilization.
  • CoreDNS pods restarting frequently.

Diagnostic Checks

Run DNS resolution tests to help identify the type of CoreDNS issue. Start by testing internal DNS resolution from within your cluster:

kubectl run -it --rm debug --image=busybox --restart=Never -- nslookup kubernetes.default

A successful lookup returns the IP address of the Kubernetes API service. If this command times out or returns an error like server can't find kubernetes.default, CoreDNS cannot resolve internal service names. Next, test external DNS resolution:

kubectl run -it --rm debug --image=busybox --restart=Never -- nslookup google.com

A successful lookup returns IP addresses for the external domain. If the lookup fails, CoreDNS cannot resolve external domains.

The combined results help narrow down the issue type:

  • Internal DNS fails, external succeeds: The kubernetes plugin cannot query the Kubernetes API, or network policies block pod-to-CoreDNS communication.
  • Both internal and external DNS fail: CoreDNS cannot reach upstream DNS servers, or the CoreDNS pods are not running.
  • Both tests succeed: If you’re still experiencing DNS issues, the problem may be intermittent, specific to certain pods or services, or related to DNS performance rather than complete failure. Proceed to gather diagnostic information with particular focus on identifying which pods are affected and when the issues occur.
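The decision logic above can be sketched as a small shell helper. The `classify` function and its messages are illustrative, not part of any tool; in practice its two arguments come from the exit codes of the two `kubectl run` tests.

```shell
# Sketch: turn the two nslookup exit codes into a diagnosis.
# classify is a hypothetical helper; $1 is the internal lookup's exit
# code and $2 the external lookup's (0 = success). In practice, capture
# them from the kubectl run commands above, e.g.:
#   kubectl run -it --rm debug --image=busybox --restart=Never -- \
#     nslookup kubernetes.default; internal_rc=$?
classify() {
  internal=$1; external=$2
  if [ "$internal" -ne 0 ] && [ "$external" -eq 0 ]; then
    echo "internal fails, external succeeds"
  elif [ "$internal" -ne 0 ]; then
    echo "both fail"
  elif [ "$external" -ne 0 ]; then
    echo "external fails, internal succeeds"
  else
    echo "both succeed"
  fi
}

classify 1 0   # prints: internal fails, external succeeds
```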

Gather Diagnostic Information

Gather detailed information about your cluster, CoreDNS pods, and configuration to help identify the root cause and provide context for troubleshooting or support requests.

Establish the Incident Timeline and Scope

Record the times (in UTC) when DNS resolution issues began and ended. Accurate timestamps are important for correlating logs and metrics across different systems during troubleshooting.

Note the cluster’s ID, name, and datacenter region by going to the Settings tab in the control panel, or by running:

kubectl cluster-info

Document whether the issue affects all pods, specific deployments, or particular node pools.

Gather Application Pod Information

Identify which pods are experiencing CoreDNS issues and their node locations. This helps determine whether the problem is isolated to specific nodes or affects the entire cluster.

List your application pods and note which nodes they’re running on:

kubectl get pods -n <your-namespace> -o wide

List CoreDNS pods and note their node locations:

kubectl get pods -n=kube-system -o wide | grep -i coredns

If affected application pods are all on the same node as a specific CoreDNS pod, the issue may be node-specific rather than cluster-wide.
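To spot such a pattern quickly, you can count affected pods per node. The pipeline below is a sketch that assumes the NODE column is the seventh field of `kubectl get pods -o wide` output; the `printf` sample data stands in for the real kubectl output, which you would pipe in instead.

```shell
# Count pods per node (assumes NODE is column 7 of `kubectl get pods -o wide`).
# Sample data stands in for the real kubectl output here.
printf 'app-1 1/1 Running 0 5m 10.0.0.1 node-a\napp-2 1/1 Running 0 5m 10.0.0.2 node-a\napp-3 1/1 Running 0 5m 10.0.0.3 node-b\n' \
  | awk '{print $7}' | sort | uniq -c | sort -rn
```

If one node dominates the count and also hosts a CoreDNS pod, focus the investigation there.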

Collect CoreDNS Pod Status and Logs

Verify that CoreDNS pods are running:

kubectl get pods -n=kube-system -l=k8s-app=kube-dns

Healthy CoreDNS pods show STATUS: Running and READY: 1/1. If pods show CrashLoopBackOff, Pending, or frequent restarts, CoreDNS cannot serve DNS queries.

Collect logs from all CoreDNS containers with timestamps:

kubectl logs --timestamps -l=k8s-app=kube-dns --all-containers=true -n=kube-system

When reviewing logs, note error patterns such as:

  • timeout or i/o timeout: Upstream DNS servers unreachable.
  • SERVFAIL: Upstream DNS server cannot resolve query.
  • read: connection refused: Cannot reach upstream DNS servers.
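These patterns can be filtered out of the log stream with a single `grep`. The sample log line below stands in for real CoreDNS output; in practice, pipe the `kubectl logs` command from the previous step into the filter.

```shell
# Filter CoreDNS logs for the error patterns listed above.
# The printf sample stands in for real `kubectl logs` output.
printf '[ERROR] plugin/errors: 2 example.com. A: read udp 10.0.0.5:53: i/o timeout\n[INFO] 10.244.0.7:41500 - 12 "A IN ok.example. udp" NOERROR\n' \
  | grep -Ei 'timeout|SERVFAIL|connection refused'
```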

Check for CoreDNS-related events:

kubectl get events -n=kube-system

If you forward logs to external systems like OpenSearch or Loki, retrieve CoreDNS logs from those systems for the incident time frame.

Analyze Resource Utilization

Check the current resource allocation and usage for CoreDNS pods:

kubectl top pods -n=kube-system -l=k8s-app=kube-dns

If usage is consistently above 80-90% of limits, CoreDNS is resource-constrained. Check for OOMKilled or eviction events:

kubectl get events -n=kube-system | grep -i "oom\|evict\|memory"

If you have monitoring tools like Prometheus and Grafana, review metrics for CoreDNS pods covering at least 30 minutes before and after the incident.

If resource constraints are identified, see How can I improve the performance of cluster DNS? for scaling strategies.
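For reference, a container `resources` block for the CoreDNS deployment looks like the following. The values shown are illustrative only, not recommendations; appropriate values depend on your cluster's query load.

```yaml
# Illustrative resources block for the coredns container
# (edit via `kubectl edit deployment coredns -n kube-system`).
# Values are examples, not recommendations.
resources:
  requests:
    cpu: 100m
    memory: 70Mi
  limits:
    memory: 256Mi
```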

Review CoreDNS Configuration

View your CoreDNS deployment details:

kubectl describe deployment coredns -n=kube-system

Check for configuration issues that can cause DNS failures, such as insufficient replicas (default is 2) or image version mismatches with your Kubernetes version.

View the CoreDNS ConfigMap:

kubectl get configmap -n=kube-system coredns -o yaml

Check for a custom CoreDNS configuration:

kubectl get configmap -n=kube-system coredns-custom -o yaml

Review these ConfigMaps for potential configuration issues, such as custom upstream DNS servers pointing to incorrect addresses. For more information, see How to Customize CoreDNS in DOKS.
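When reading the Corefile, the `forward` plugin line controls where queries for external domains are sent, so a wrong address there is a common cause of external-resolution failures. The excerpt below shows the shape of a typical default Corefile stanza; your cluster's actual Corefile may differ.

```
# Excerpt of a typical Corefile stanza (shape only; your cluster's
# ConfigMap may differ). The forward line sends non-cluster queries
# to the node's resolvers.
.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
    }
    forward . /etc/resolv.conf
    cache 30
    loop
    reload
}
```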

Check for pod-level DNS overrides:

kubectl get pod <pod-name> -n <namespace> -o yaml | grep -A 10 "dnsPolicy\|dnsConfig"

If dnsPolicy is set to Default, the pod inherits the node's DNS configuration and bypasses CoreDNS. If it is set to None, the pod uses only the nameservers listed in its dnsConfig, which may or may not include CoreDNS.
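For reference, a pod spec that overrides cluster DNS looks like the following; the pod name and nameserver address are illustrative.

```yaml
# Illustrative pod spec: dnsPolicy "None" means the pod resolves only
# through the nameservers listed in dnsConfig, bypassing CoreDNS.
apiVersion: v1
kind: Pod
metadata:
  name: custom-dns-pod
spec:
  dnsPolicy: "None"
  dnsConfig:
    nameservers:
      - 1.1.1.1
  containers:
    - name: app
      image: busybox
      command: ["sleep", "3600"]
```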

Investigate Node Conditions

Check if nodes are using shared or dedicated CPU Droplets:

kubectl get nodes --show-labels | grep node.kubernetes.io/instance-type

Shared Droplets can experience CPU steal during high load, causing intermittent CoreDNS slowness. Also check whether nodes report pressure conditions (MemoryPressure, DiskPressure, PIDPressure) or a Ready condition of False:

kubectl describe nodes

Common Issues and Solutions

Refer to the table below to match your symptoms with common CoreDNS issues and recommended solutions.

Issue                | Symptoms                                          | Solution
Resource constraints | High CPU/memory usage, slow resolution, timeouts  | Scale horizontally (add replicas) or vertically (increase limits). See the DNS performance guide.
External DNS fails   | External domains don't resolve, internal works    | Check upstream DNS configuration; verify network connectivity to DigitalOcean DNS.
Internal DNS fails   | Kubernetes services don't resolve, external works | Check kubernetes plugin configuration; verify the cluster domain (default: cluster.local).
High query rate      | CoreDNS at resource limits, high request rate     | Enable NodeLocal DNSCache. See the DNS performance guide.
Frequent restarts    | OOMKilled events, high restart count              | Increase memory limits or enable caching.

If you’re unable to resolve the issue, open a support ticket and provide the details gathered in the previous steps:

  • Incident timeline (start/end times in UTC)
  • Cluster ID and region
  • Symptoms observed (internal/external resolution failures)
  • CoreDNS pod logs (attach full log file)
  • CoreDNS ConfigMap configuration
  • Node and pod information (affected nodes, resource utilization)
  • Any custom CoreDNS configuration or pod-level DNS overrides
  • Monitoring data (if available)