How to Troubleshoot CoreDNS Issues in DOKS Clusters
Validated on 21 Nov 2025 • Last edited on 13 Jan 2026
DOKS uses CoreDNS for in-cluster DNS management. This article provides steps to diagnose CoreDNS issues, collect diagnostic information, and determine whether the problem is related to configuration, resource constraints, or network connectivity.
Common Symptoms of CoreDNS Issues
Common symptoms of CoreDNS issues include:
- Pods that cannot resolve external domain names.
- Pods that cannot resolve internal Kubernetes service names.
- Intermittent DNS resolution failures or timeouts.
- Application logs showing DNS lookup errors.
- CoreDNS pods showing high CPU or memory utilization.
- CoreDNS pods restarting frequently.
Diagnostic Checks
Run DNS resolution tests to help identify the type of CoreDNS issue. Start by testing internal DNS resolution from within your cluster:
```
kubectl run -it --rm debug --image=busybox --restart=Never -- nslookup kubernetes.default
```

A successful lookup returns the IP address of the Kubernetes API service. If this command times out or returns an error like `server can't find kubernetes.default`, CoreDNS cannot resolve internal service names.

Next, test external DNS resolution:
```
kubectl run -it --rm debug --image=busybox --restart=Never -- nslookup google.com
```

A successful lookup returns IP addresses for the external domain. If the lookup fails, CoreDNS cannot resolve external domains.
The combined results help narrow down the issue type:
- Internal DNS fails, external succeeds: The `kubernetes` plugin cannot query the Kubernetes API, or network policies block pod-to-CoreDNS communication.
- Both internal and external DNS fail: CoreDNS cannot reach upstream DNS servers, or the CoreDNS pods are not running.
- Both tests succeed: If you’re still experiencing DNS issues, the problem may be intermittent, specific to certain pods or services, or related to DNS performance rather than complete failure. Proceed to gather diagnostic information with particular focus on identifying which pods are affected and when the issues occur.
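To further isolate where resolution is failing, you can query an individual CoreDNS pod directly by its IP instead of going through the cluster DNS service. The following is a minimal sketch; the pod IP is a placeholder you replace with one from your own cluster.

```
# List the CoreDNS pod IPs
kubectl get pods -n=kube-system -l=k8s-app=kube-dns -o wide

# Query one CoreDNS pod directly (replace <coredns-pod-ip> with an IP from the output above)
kubectl run -it --rm debug --image=busybox --restart=Never -- nslookup kubernetes.default <coredns-pod-ip>
```

If queries against a pod IP succeed while the earlier tests fail, the problem is more likely in the service path (for example, kube-proxy or network policies) than in CoreDNS itself.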
Gather Diagnostic Information
Gather detailed information about your cluster, CoreDNS pods, and configuration to help identify the root cause and provide context for troubleshooting or support requests.
Establish the Incident Timeline and Scope
Record the times (in UTC) when DNS resolution issues began and ended. Accurate timestamps are important for correlating logs and metrics across different systems during troubleshooting.
Note the cluster’s ID, name, and datacenter region by going to the Settings tab in the control panel, or by running:
```
kubectl cluster-info
```

Document whether the issue affects all pods, specific deployments, or particular node pools.
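If you suspect only certain node pools are affected, you can group nodes by their pool label. This sketch assumes the standard DOKS node pool label; if your nodes use a different label, check the output of `kubectl get nodes --show-labels`.

```
# Show each node with its node pool (assumes the doks.digitalocean.com/node-pool label)
kubectl get nodes -L doks.digitalocean.com/node-pool
```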
Gather Application Pod Information
Identify which pods are experiencing CoreDNS issues and their node locations. This helps determine whether the problem is isolated to specific nodes or affects the entire cluster.
List your application pods and note which nodes they’re running on:
```
kubectl get pods -n <your-namespace> -o wide
```

List CoreDNS pods and note their node locations:
```
kubectl get pods -n=kube-system -o wide | grep -i coredns
```

If affected application pods are all on the same node as a specific CoreDNS pod, the issue may be node-specific rather than cluster-wide.
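To see everything scheduled on a suspect node, you can filter pods by node name. The node name below is a placeholder taken from the previous output.

```
# List all pods running on a specific node
kubectl get pods -A -o wide --field-selector spec.nodeName=<node-name>
```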
Collect CoreDNS Pod Status and Logs
Verify that CoreDNS pods are running:
```
kubectl get pods -n=kube-system -l=k8s-app=kube-dns
```

Healthy CoreDNS pods show `STATUS: Running` and `READY: 1/1`. If pods show `CrashLoopBackOff`, `Pending`, or frequent restarts, CoreDNS cannot serve DNS queries.
Collect logs from all CoreDNS containers with timestamps:
```
kubectl logs --timestamps -l=k8s-app=kube-dns --all-containers=true -n=kube-system
```

When reviewing logs, note error patterns such as:
- `timeout` or `i/o timeout`: Upstream DNS servers unreachable.
- `SERVFAIL`: Upstream DNS server cannot resolve query.
- `read: connection refused`: Cannot reach upstream DNS servers.
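As a shortcut, you can pipe the collected logs through grep to surface these patterns in one pass:

```
# Filter CoreDNS logs for the common error patterns listed above
kubectl logs --timestamps -l=k8s-app=kube-dns --all-containers=true -n=kube-system \
  | grep -iE "timeout|servfail|connection refused"
```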
Check for CoreDNS-related events:
```
kubectl get events -n=kube-system
```

If you forward logs to external systems like OpenSearch or Loki, retrieve CoreDNS logs from those systems for the incident time frame.
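If the full kube-system event stream is noisy, you can also narrow the events query to a single CoreDNS pod; the pod name below is a placeholder.

```
# Show only events that reference a specific CoreDNS pod
kubectl get events -n=kube-system --field-selector involvedObject.name=<coredns-pod-name>
```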
Analyze Resource Utilization
Check the current resource allocation and usage for CoreDNS pods:
```
kubectl top pods -n=kube-system -l=k8s-app=kube-dns
```

If usage is consistently above 80-90% of limits, CoreDNS is resource-constrained. Check for `OOMKilled` or eviction events:
```
kubectl get events -n=kube-system | grep -i "oom\|evict\|memory"
```

If you have monitoring tools like Prometheus and Grafana, review metrics for CoreDNS pods covering at least 30 minutes before and after the incident.
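To see what those utilization percentages are measured against, you can print the requests and limits configured on the CoreDNS deployment:

```
# Print the resource requests and limits configured for the CoreDNS container
kubectl get deployment coredns -n=kube-system -o jsonpath='{.spec.template.spec.containers[0].resources}'
```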
If resource constraints are identified, see How can I improve the performance of cluster DNS? for scaling strategies.
Review CoreDNS Configuration
View your CoreDNS deployment details:
```
kubectl describe deployment coredns -n=kube-system
```

Check for configuration issues that can cause DNS failures, such as insufficient replicas (default is 2) or image version mismatches with your Kubernetes version.
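To pull just the replica count and image out of the deployment, a jsonpath query like the following can help:

```
# Print the configured replica count and the CoreDNS image in use
kubectl get deployment coredns -n=kube-system \
  -o jsonpath='{"replicas: "}{.spec.replicas}{"\n"}{"image: "}{.spec.template.spec.containers[0].image}{"\n"}'
```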
View the CoreDNS ConfigMap:
```
kubectl get configmap -n=kube-system coredns -o yaml
```

Check for a custom CoreDNS configuration:
```
kubectl get configmap -n=kube-system coredns-custom -o yaml
```

Review these ConfigMaps for potential configuration issues, such as custom upstream DNS servers pointing to incorrect addresses. For more information, see How to Customize CoreDNS in DOKS.
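A quick way to spot a misconfigured upstream is to print only the Corefile and look at its forward directive. This sketch assumes the Corefile is stored under the default `Corefile` key of the `coredns` ConfigMap.

```
# Print the Corefile and show the upstream forward configuration with some context
kubectl get configmap coredns -n=kube-system -o jsonpath='{.data.Corefile}' | grep -B 1 -A 2 forward
```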
Check for pod-level DNS overrides:
```
kubectl get pod <pod-name> -n <namespace> -o yaml | grep -A 10 "dnsPolicy\|dnsConfig"
```

If `dnsPolicy` is set to `None` or `Default`, the pod bypasses CoreDNS and uses the node's DNS resolver.
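To check every pod in a namespace at once instead of one pod at a time, you can list each pod's `dnsPolicy` (the default is `ClusterFirst`); the namespace is a placeholder.

```
# Print each pod's name and dnsPolicy so overrides stand out
kubectl get pods -n <namespace> -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.dnsPolicy}{"\n"}{end}'
```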
Investigate Node Conditions
Check if nodes are using shared or dedicated CPU Droplets:
```
kubectl get nodes --show-labels | grep node.kubernetes.io/instance-type
```

Shared Droplets can experience CPU steal during high load, causing intermittent CoreDNS slowness. Check if nodes show pressure or have a false Ready condition:
```
kubectl describe nodes
```

Common Issues and Solutions
Refer to the table below to match your symptoms with common CoreDNS issues and recommended solutions.
| Issue | Symptoms | Solution |
|---|---|---|
| Resource constraints | High CPU/memory usage, slow resolution, timeouts | Scale horizontally (add replicas) or vertically (increase limits). See DNS performance guide |
| External DNS fails | External domains don’t resolve, internal works | Check upstream DNS configuration, verify network connectivity to DigitalOcean DNS |
| Internal DNS fails | Kubernetes services don’t resolve, external works | Check kubernetes plugin configuration, verify cluster domain (default: cluster.local) |
| High query rate | CoreDNS at resource limits, high request rate | Enable NodeLocal DNSCache. See DNS performance guide |
| Frequent restarts | OOMKilled events, high restart count | Increase memory limits or enable caching |
If you’re unable to resolve the issue, open a support ticket and provide the details gathered in the previous steps:
- Incident timeline (start/end times in UTC)
- Cluster ID and region
- Symptoms observed (internal/external resolution failures)
- CoreDNS pod logs (attach full log file)
- CoreDNS ConfigMap configuration
- Node and pod information (affected nodes, resource utilization)
- Any custom CoreDNS configuration or pod-level DNS overrides
- Monitoring data (if available)
Related Topics
- Health checks often fail due to firewalls or misconfigured backend server software.
- Enable DNS caching, use non-shared machine types for the cluster, and scale out or reduce DNS traffic.