CoreDNS pods are in CrashLoopBackOff or Error state

I am trying to set up a Kubernetes master by issuing the following command:

kubeadm init --pod-network-cidr=192.168.0.0/16

  1. followed by: installing a pod network add-on (Calico)
  2. followed by: master isolation


Issue: the coredns pods are in a CrashLoopBackOff or Error state:

# kubectl get pods -n kube-system

NAME READY STATUS RESTARTS AGE

calico-node-lflwx 2/2 Running 0 2d

coredns-576cbf47c7-nm7gc 0/1 CrashLoopBackOff 69 2d

coredns-576cbf47c7-nwcnx 0/1 CrashLoopBackOff 69 2d

etcd-suey.nknwn.local 1/1 Running 0 2d

kube-apiserver-suey.nknwn.local 1/1 Running 0 2d

kube-controller-manager-suey.nknwn.local 1/1 Running 0 2d

kube-proxy-xkgdr 1/1 Running 0 2d

kube-scheduler-suey.nknwn.local 1/1 Running 0 2d

#

I went through Kubernetes' troubleshooting guide for kubeadm, but my node is not running SELinux and my Docker is up to date.

# docker --version

Docker version 18.06.1-ce, build e68fc7a

#

kubectl describe:

# kubectl -n kube-system describe pod coredns-576cbf47c7-nwcnx 

Name: coredns-576cbf47c7-nwcnx

Namespace: kube-system

Priority: 0

PriorityClassName: <none>

Node: suey.nknwn.local/192.168.86.81

Start Time: Sun, 28 Oct 2018 22:39:46 -0400

Labels: k8s-app=kube-dns

pod-template-hash=576cbf47c7

Annotations: cni.projectcalico.org/podIP: 192.168.0.30/32

Status: Running

IP: 192.168.0.30

Controlled By: ReplicaSet/coredns-576cbf47c7

Containers:

coredns:

Container ID: docker://ec65b8f40c38987961e9ed099dfa2e8bb35699a7f370a2cda0e0d522a0b05e79

Image: k8s.gcr.io/coredns:1.2.2

Image ID: docker-pullable://k8s.gcr.io/coredns@sha256:3e2be1cec87aca0b74b7668bbe8c02964a95a402e45ceb51b2252629d608d03a

Ports: 53/UDP, 53/TCP, 9153/TCP

Host Ports: 0/UDP, 0/TCP, 0/TCP

Args:

-conf

/etc/coredns/Corefile

State: Running

Started: Wed, 31 Oct 2018 23:28:58 -0400

Last State: Terminated

Reason: Error

Exit Code: 137

Started: Wed, 31 Oct 2018 23:21:35 -0400

Finished: Wed, 31 Oct 2018 23:23:54 -0400

Ready: True

Restart Count: 103

Limits:

memory: 170Mi

Requests:

cpu: 100m

memory: 70Mi

Liveness: http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5

Environment: <none>

Mounts:

/etc/coredns from config-volume (ro)

/var/run/secrets/kubernetes.io/serviceaccount from coredns-token-xvq8b (ro)

Conditions:

Type Status

Initialized True

Ready True

ContainersReady True

PodScheduled True

Volumes:

config-volume:

Type: ConfigMap (a volume populated by a ConfigMap)

Name: coredns

Optional: false

coredns-token-xvq8b:

Type: Secret (a volume populated by a Secret)

SecretName: coredns-token-xvq8b

Optional: false

QoS Class: Burstable

Node-Selectors: <none>

Tolerations: CriticalAddonsOnly

node-role.kubernetes.io/master:NoSchedule

node.kubernetes.io/not-ready:NoExecute for 300s

node.kubernetes.io/unreachable:NoExecute for 300s

Events:

Type Reason Age From Message

---- ------ ---- ---- -------

Normal Killing 54m (x10 over 4h19m) kubelet, suey.nknwn.local Killing container with id docker://coredns:Container failed liveness probe.. Container will be killed and recreated.

Warning Unhealthy 9m56s (x92 over 4h20m) kubelet, suey.nknwn.local Liveness probe failed: HTTP probe failed with statuscode: 503

Warning BackOff 5m4s (x173 over 4h10m) kubelet, suey.nknwn.local Back-off restarting failed container

# kubectl -n kube-system describe pod coredns-576cbf47c7-nm7gc

Name: coredns-576cbf47c7-nm7gc

Namespace: kube-system

Priority: 0

PriorityClassName: <none>

Node: suey.nknwn.local/192.168.86.81

Start Time: Sun, 28 Oct 2018 22:39:46 -0400

Labels: k8s-app=kube-dns

pod-template-hash=576cbf47c7

Annotations: cni.projectcalico.org/podIP: 192.168.0.31/32

Status: Running

IP: 192.168.0.31

Controlled By: ReplicaSet/coredns-576cbf47c7

Containers:

coredns:

Container ID: docker://0f2db8d89a4c439763e7293698d6a027a109bf556b806d232093300952a84359

Image: k8s.gcr.io/coredns:1.2.2

Image ID: docker-pullable://k8s.gcr.io/coredns@sha256:3e2be1cec87aca0b74b7668bbe8c02964a95a402e45ceb51b2252629d608d03a

Ports: 53/UDP, 53/TCP, 9153/TCP

Host Ports: 0/UDP, 0/TCP, 0/TCP

Args:

-conf

/etc/coredns/Corefile

State: Running

Started: Wed, 31 Oct 2018 23:29:11 -0400

Last State: Terminated

Reason: Error

Exit Code: 137

Started: Wed, 31 Oct 2018 23:21:58 -0400

Finished: Wed, 31 Oct 2018 23:24:08 -0400

Ready: True

Restart Count: 102

Limits:

memory: 170Mi

Requests:

cpu: 100m

memory: 70Mi

Liveness: http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5

Environment: <none>

Mounts:

/etc/coredns from config-volume (ro)

/var/run/secrets/kubernetes.io/serviceaccount from coredns-token-xvq8b (ro)

Conditions:

Type Status

Initialized True

Ready True

ContainersReady True

PodScheduled True

Volumes:

config-volume:

Type: ConfigMap (a volume populated by a ConfigMap)

Name: coredns

Optional: false

coredns-token-xvq8b:

Type: Secret (a volume populated by a Secret)

SecretName: coredns-token-xvq8b

Optional: false

QoS Class: Burstable

Node-Selectors: <none>

Tolerations: CriticalAddonsOnly

node-role.kubernetes.io/master:NoSchedule

node.kubernetes.io/not-ready:NoExecute for 300s

node.kubernetes.io/unreachable:NoExecute for 300s

Events:

Type Reason Age From Message

---- ------ ---- ---- -------

Normal Killing 44m (x12 over 4h18m) kubelet, suey.nknwn.local Killing container with id docker://coredns:Container failed liveness probe.. Container will be killed and recreated.

Warning BackOff 4m58s (x170 over 4h9m) kubelet, suey.nknwn.local Back-off restarting failed container

Warning Unhealthy 8s (x102 over 4h19m) kubelet, suey.nknwn.local Liveness probe failed: HTTP probe failed with statuscode: 503

#

kubectl logs:

# kubectl -n kube-system logs -f coredns-576cbf47c7-nm7gc 

E1101 03:31:58.974836 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout

E1101 03:31:58.974836 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout

E1101 03:31:58.974857 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout

E1101 03:32:29.975493 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout

E1101 03:32:29.976732 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout

E1101 03:32:29.977788 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout

E1101 03:33:00.976164 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout

E1101 03:33:00.977415 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout

E1101 03:33:00.978332 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout

2018/11/01 03:33:08 [INFO] SIGTERM: Shutting down servers then terminating

E1101 03:33:31.976864 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout

E1101 03:33:31.978080 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout

E1101 03:33:31.979156 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout

#

# kubectl -n kube-system log -f coredns-576cbf47c7-gqdgd

.:53

2018/11/05 04:04:13 [INFO] CoreDNS-1.2.2

2018/11/05 04:04:13 [INFO] linux/amd64, go1.11, eb51e8b

CoreDNS-1.2.2

linux/amd64, go1.11, eb51e8b

2018/11/05 04:04:13 [INFO] plugin/reload: Running configuration MD5 = f65c4821c8a9b7b5eb30fa4fbc167769

2018/11/05 04:04:19 [FATAL] plugin/loop: Seen "HINFO IN 3597544515206064936.6415437575707023337." more than twice, loop detected

# kubectl -n kube-system log -f coredns-576cbf47c7-hhmws

.:53

2018/11/05 04:04:18 [INFO] CoreDNS-1.2.2

2018/11/05 04:04:18 [INFO] linux/amd64, go1.11, eb51e8b

CoreDNS-1.2.2

linux/amd64, go1.11, eb51e8b

2018/11/05 04:04:18 [INFO] plugin/reload: Running configuration MD5 = f65c4821c8a9b7b5eb30fa4fbc167769

2018/11/05 04:04:24 [FATAL] plugin/loop: Seen "HINFO IN 6900627972087569316.7905576541070882081." more than twice, loop detected

#

kubectl describe (apiserver):

# kubectl -n kube-system describe pod kube-apiserver-suey.nknwn.local

Name: kube-apiserver-suey.nknwn.local

Namespace: kube-system

Priority: 2000000000

PriorityClassName: system-cluster-critical

Node: suey.nknwn.local/192.168.87.20

Start Time: Fri, 02 Nov 2018 00:28:44 -0400

Labels: component=kube-apiserver

tier=control-plane

Annotations: kubernetes.io/config.hash: 2433a531afe72165364aace3b746ea4c

kubernetes.io/config.mirror: 2433a531afe72165364aace3b746ea4c

kubernetes.io/config.seen: 2018-11-02T00:28:43.795663261-04:00

kubernetes.io/config.source: file

scheduler.alpha.kubernetes.io/critical-pod:

Status: Running

IP: 192.168.87.20

Containers:

kube-apiserver:

Container ID: docker://659456385a1a859f078d36f4d1b91db9143d228b3bc5b3947a09460a39ce41fc

Image: k8s.gcr.io/kube-apiserver:v1.12.2

Image ID: docker-pullable://k8s.gcr.io/kube-apiserver@sha256:094929baf3a7681945d83a7654b3248e586b20506e28526121f50eb359cee44f

Port: <none>

Host Port: <none>

Command:

kube-apiserver

--authorization-mode=Node,RBAC

--advertise-address=192.168.87.20

--allow-privileged=true

--client-ca-file=/etc/kubernetes/pki/ca.crt

--enable-admission-plugins=NodeRestriction

--enable-bootstrap-token-auth=true

--etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt

--etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt

--etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key

--etcd-servers=https://127.0.0.1:2379

--insecure-port=0

--kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt

--kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key

--kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname

--proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt

--proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key

--requestheader-allowed-names=front-proxy-client

--requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt

--requestheader-extra-headers-prefix=X-Remote-Extra-

--requestheader-group-headers=X-Remote-Group

--requestheader-username-headers=X-Remote-User

--secure-port=6443

--service-account-key-file=/etc/kubernetes/pki/sa.pub

--service-cluster-ip-range=10.96.0.0/12

--tls-cert-file=/etc/kubernetes/pki/apiserver.crt

--tls-private-key-file=/etc/kubernetes/pki/apiserver.key

State: Running

Started: Sun, 04 Nov 2018 22:57:27 -0500

Last State: Terminated

Reason: Completed

Exit Code: 0

Started: Sun, 04 Nov 2018 20:12:06 -0500

Finished: Sun, 04 Nov 2018 22:55:24 -0500

Ready: True

Restart Count: 2

Requests:

cpu: 250m

Liveness: http-get https://192.168.87.20:6443/healthz delay=15s timeout=15s period=10s #success=1 #failure=8

Environment: <none>

Mounts:

/etc/ca-certificates from etc-ca-certificates (ro)

/etc/kubernetes/pki from k8s-certs (ro)

/etc/ssl/certs from ca-certs (ro)

/usr/local/share/ca-certificates from usr-local-share-ca-certificates (ro)

/usr/share/ca-certificates from usr-share-ca-certificates (ro)

Conditions:

Type Status

Initialized True

Ready True

ContainersReady True

PodScheduled True

Volumes:

etc-ca-certificates:

Type: HostPath (bare host directory volume)

Path: /etc/ca-certificates

HostPathType: DirectoryOrCreate

k8s-certs:

Type: HostPath (bare host directory volume)

Path: /etc/kubernetes/pki

HostPathType: DirectoryOrCreate

ca-certs:

Type: HostPath (bare host directory volume)

Path: /etc/ssl/certs

HostPathType: DirectoryOrCreate

usr-share-ca-certificates:

Type: HostPath (bare host directory volume)

Path: /usr/share/ca-certificates

HostPathType: DirectoryOrCreate

usr-local-share-ca-certificates:

Type: HostPath (bare host directory volume)

Path: /usr/local/share/ca-certificates

HostPathType: DirectoryOrCreate

QoS Class: Burstable

Node-Selectors: <none>

Tolerations: :NoExecute

Events: <none>

#

syslog (host):

Nov  4 22:59:36 suey kubelet[1234]: E1104 22:59:36.139538 1234
pod_workers.go:186] Error syncing pod d8146b7e-de57-11e8-a1e2-ec8eb57434c8
("coredns-576cbf47c7-hhmws_kube-system(d8146b7e-de57-11e8-a1e2-ec8eb57434c8)"),
skipping: failed to "StartContainer" for "coredns" with CrashLoopBackOff:
"Back-off 40s restarting failed container=coredns
pod=coredns-576cbf47c7-hhmws_kube-system(d8146b7e-de57-11e8-a1e2-ec8eb57434c8)"

Please advise.

Answer:

This error:

[FATAL] plugin/loop: Seen "HINFO IN 6900627972087569316.7905576541070882081." more than twice, loop detected

occurs when CoreDNS detects a loop in the resolver configuration, and it is the intended behavior: the loop plugin sends a query for a random name through the configured upstream, and if that same query arrives back at CoreDNS more than twice (the HINFO lines in the logs above), the forwarding path must loop back to CoreDNS, so it exits. You are hitting the following issues:

https://github.com/kubernetes/kubeadm/issues/1162

https://github.com/coredns/coredns/issues/2087

Edit the CoreDNS ConfigMap:

kubectl -n kube-system edit configmap coredns

Delete or comment out the line with loop, then save and exit.
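For reference, the Corefile inside that ConfigMap usually looks roughly like the sketch below (exact contents vary with the Kubernetes/CoreDNS version); the directive to remove is the bare loop line:

.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        upstream
        fallthrough in-addr.arpa ip6.arpa
    }
    prometheus :9153
    proxy . /etc/resolv.conf
    cache 30
    # loop    <- delete or comment out this line
    reload
    loadbalance
}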

Then delete the CoreDNS pods so that new ones can be created with the new configuration:

kubectl -n kube-system delete pod -l k8s-app=kube-dns

After that, everything should be fine.
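To confirm, you can watch the replacement pods until they settle into Running:

kubectl -n kube-system get pods -l k8s-app=kube-dns -w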

However, the loop originates in the host's DNS setup, so it is better to fix that than to disable loop detection. First, check whether you are using systemd-resolved; if you are running Ubuntu 18.04, this is likely the case.

systemctl list-unit-files | grep enabled | grep systemd-resolved

If it is, check which resolv.conf file the cluster is using as a reference:

ps auxww | grep kubelet

You will probably see a line like this:

/usr/bin/kubelet ... --resolv-conf=/run/systemd/resolve/resolv.conf

The important part is --resolv-conf — from it we can tell that the systemd-generated resolv.conf is being used.

Check the contents of /run/systemd/resolve/resolv.conf to see whether there is a record like this:

nameserver 127.0.0.1

If 127.0.0.1 is there, it is what causes the loop: kubelet hands that file to CoreDNS as its upstream resolver configuration, and inside the pod 127.0.0.1 is CoreDNS itself, so every forwarded query comes straight back to it.

To get rid of it, you should not edit that file directly; instead, fix the sources it is generated from so that it gets regenerated correctly.

Check all the files under /etc/systemd/network, and if you find a record like

DNS=127.0.0.1

delete that record. Also check /etc/systemd/resolved.conf and do the same there if needed. Make sure that at least one or two DNS servers are configured, for example:

DNS=1.1.1.1 1.0.0.1
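For example, a minimal /etc/systemd/resolved.conf could look like the sketch below (1.1.1.1 and 1.0.0.1 are just example public resolvers; use whatever upstream servers suit your network):

[Resolve]
DNS=1.1.1.1 1.0.0.1
#FallbackDNS=
#Domains=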

Once you have done all of that, restart the systemd services for your changes to take effect:

systemctl restart systemd-networkd systemd-resolved

After that, verify that nameserver 127.0.0.1 is no longer in the generated resolv.conf file:

cat /run/systemd/resolve/resolv.conf

Finally, trigger re-creation of the DNS pods:

kubectl -n kube-system delete pod -l k8s-app=kube-dns

The fix boils down to eliminating what looks like a DNS lookup loop from the host's DNS configuration; the exact steps vary between different resolv.conf managers/implementations.
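If regenerating the host's resolv.conf is not practical on your distribution, another commonly used approach is to point kubelet at a resolv.conf that lists real upstream servers instead of the local stub. The sketch below assumes a kubeadm-provisioned host where the extra kubelet flags live in /var/lib/kubelet/kubeadm-flags.env; the path and your existing flags may differ:

# /var/lib/kubelet/kubeadm-flags.env -- hypothetical example:
# keep your existing args and append --resolv-conf
KUBELET_KUBEADM_ARGS=--cgroup-driver=systemd --network-plugin=cni --resolv-conf=/run/systemd/resolve/resolv.conf

Then restart kubelet and recreate the DNS pods so they pick up the new upstreams:

systemctl restart kubelet
kubectl -n kube-system delete pod -l k8s-app=kube-dns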
