Introduction
While poking around production I noticed that the Pods of one service had an unusually high restart count (we're a small shop and the monitoring isn't fully in place yet, so bear with me). Checking the Pod details showed the exit reason was OOM, i.e. the container had exceeded its memory limit.
The last exit happened just after noon on July 12. The next step was to find out which node the Pod had been scheduled to and dig into that node's journalctl logs for the underlying cause.
1. Spotting the Pods with high restart counts
# kubectl get pod -A
production systembusiness-deployment-5c6969978b-6m4gx 1/1 Running 14 6d18h
production systembusiness-deployment-5c6969978b-hx5lt 1/1 Running 12 6d18h
production systembusiness-deployment-5c6969978b-qfh5x 1/1 Running 27 6d18h
From the output above, all three Pods under this Deployment show noticeably high restart counts.
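If the cluster has many namespaces, sorting by restart count surfaces the noisy Pods immediately. A minimal sketch, assuming single-container Pods (the [0] index refers to the first container):
# kubectl get pod -A --sort-by='.status.containerStatuses[0].restartCount'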
# kubectl describe pod -n production systembusiness-deployment-5c6969978b-hx5lt
State: Running
Started: Mon, 12 Jul 2021 12:10:20 +0800
Last State: Terminated
Reason: OOMKilled
Exit Code: 137
Started: Fri, 09 Jul 2021 20:44:31 +0800
Finished: Mon, 12 Jul 2021 12:10:20 +0800
Ready: True
Restart Count: 27
Limits:
cpu: 500m
memory: 1000Mi
Requests:
cpu: 100m
memory: 100Mi
In the describe output, focus on two parts: the resource Limits and the Last State. Last State shows the exit reason was OOMKilled, i.e. the container exceeded its memory limit, and it was killed at 12:10 on July 12. Also pay special attention to the resource limits here: CPU is limited to 500m and memory to 1000Mi.
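Instead of scanning the whole describe output, the last termination reason can also be pulled out directly with a jsonpath query. A sketch, again assuming a single-container Pod:
# kubectl get pod -n production systembusiness-deployment-5c6969978b-hx5lt -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
OOMKilled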
2. Investigating the memory-limit breach
To dig further into the cause, first find out which node the Pod was scheduled onto, then go to that node and inspect the journal and kubelet logs.
# kubectl get pod -A -owide | grep systembusiness
production systembusiness-deployment-5c6969978b-6m4gx 1/1 Running 15 6d22h 10.244.154.209 k8s-node-01 <none> <none>
production systembusiness-deployment-5c6969978b-hx5lt 1/1 Running 13 6d22h 10.244.127.255 k8s-node-04 <none> <none>
production systembusiness-deployment-5c6969978b-qfh5x 1/1 Running 29 6d22h 10.244.162.29 k8s-node-08 <none> <none>
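If you only need the node for one particular Pod, a jsonpath query works as well (a minimal sketch):
# kubectl get pod -n production systembusiness-deployment-5c6969978b-hx5lt -o jsonpath='{.spec.nodeName}'
k8s-node-04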
Pick one of the Pods and go to its node to check the kubelet and kernel logs (here: k8s-node-04).
On k8s-node-04, use journalctl to filter the logs by time:
# journalctl --since="2021-07-12 08:00:00"
7月 12 12:11:16 k8s-node-04 kernel: systembus invoked oom-killer: gfp_mask=0x6000c0(GFP_KERNEL), nodemask=(null), order=0, oom_score_adj=994
7月 12 12:11:16 k8s-node-04 kernel: systembusin cpuset=docker-052d52a163eb1ced8ddbd9968b31cb1d1ec74a2b389e18984737c3118fdd7a5e.scope mems_allowed=0
7月 12 12:11:16 k8s-node-04 kernel: CPU: 2 PID: 1735957 Comm: systembusin Tainted: G OE --------- - - 4.18.0-193.14.2.el8_2.x86_64 #1
7月 12 12:11:16 k8s-node-04 kernel: Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 8a46cfe 04/01/2014
7月 12 12:11:16 k8s-node-04 kernel: Call Trace:
7月 12 12:11:16 k8s-node-04 kernel: dump_stack+0x5c/0x80
7月 12 12:11:16 k8s-node-04 kernel: dump_header+0x6e/0x27a
7月 12 12:11:16 k8s-node-04 kernel: oom_kill_process.cold.28+0xb/0x10
7月 12 12:11:16 k8s-node-04 kernel: out_of_memory+0x1ba/0x490
7月 12 12:11:16 k8s-node-04 kernel: mem_cgroup_out_of_memory+0x49/0x80
7月 12 12:11:16 k8s-node-04 kernel: try_charge+0x6fa/0x780
7月 12 12:11:16 k8s-node-04 kernel: ? __alloc_pages_nodemask+0xef/0x280
7月 12 12:11:16 k8s-node-04 kernel: mem_cgroup_try_charge+0x8b/0x1a0
7月 12 12:11:16 k8s-node-04 kernel: mem_cgroup_try_charge_delay+0x1c/0x40
7月 12 12:11:16 k8s-node-04 kernel: do_anonymous_page+0xb5/0x360
7月 12 12:11:16 k8s-node-04 kernel: __handle_mm_fault+0x662/0x6a0
7月 12 12:11:16 k8s-node-04 kernel: handle_mm_fault+0xda/0x200
7月 12 12:11:16 k8s-node-04 kernel: __do_page_fault+0x22d/0x4e0
7月 12 12:11:16 k8s-node-04 kernel: do_page_fault+0x32/0x110
7月 12 12:11:16 k8s-node-04 kernel: ? async_page_fault+0x8/0x30
7月 12 12:11:16 k8s-node-04 kernel: async_page_fault+0x1e/0x30
7月 12 12:11:16 k8s-node-04 kernel: RIP: 0033:0x466bbf
7月 12 12:11:16 k8s-node-04 kernel: Code: 00 00 c5 fe 6f 06 c5 fe 6f 4e 20 c5 fe 6f 56 40 c5 fe 6f 5e 60 48 81 c6 80 00 00 00 c5 fd e7 07 c5 fd e7 4f 20 c5 fd e7 57 40 <c5> fd e7 5f 60 48 81 c7 80 00 00 00 48 81 eb >
7月 12 12:11:16 k8s-node-04 kernel: RSP: 002b:000000c000478a98 EFLAGS: 00010202
7月 12 12:11:16 k8s-node-04 kernel: RAX: 00000000000812dc RBX: 00000000000c06c0 RCX: 000000c0399096e0
7月 12 12:11:16 k8s-node-04 kernel: RDX: 000000c030880000 RSI: 000000c030bc9020 RDI: 000000c039848fa0
7月 12 12:11:16 k8s-node-04 kernel: RBP: 000000c000478af0 R08: 000000c039500000 R09: 000000c030880000
7月 12 12:11:16 k8s-node-04 kernel: R10: 0000000000000020 R11: 0581f4fae9181cae R12: 0000000000000002
7月 12 12:11:16 k8s-node-04 kernel: R13: 00000000016839e0 R14: 0000000000000000 R15: 0000000000465500
7月 12 12:11:16 k8s-node-04 kernel: Task in /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podf05bdc36_dd14_4537_9b91_57cc07df0932.slice/docker-052d52a163eb1ced8ddbd9968b31cb1d1ec74a2b389e18984737c3118f>
7月 12 12:11:16 k8s-node-04 kernel: memory: usage 1024000kB, limit 1024000kB, failcnt 821
7月 12 12:11:16 k8s-node-04 kernel: memory+swap: usage 1024000kB, limit 9007199254740988kB, failcnt 0
7月 12 12:11:16 k8s-node-04 kernel: kmem: usage 5100kB, limit 9007199254740988kB, failcnt 0
7月 12 12:11:16 k8s-node-04 kernel: Memory cgroup stats for /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podf05bdc36_dd14_4537_9b91_57cc07df0932.slice: cache:0KB rss:0KB rss_huge:0KB shmem:0KB mapped_>
7月 12 12:11:16 k8s-node-04 kernel: Memory cgroup stats for /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podf05bdc36_dd14_4537_9b91_57cc07df0932.slice/docker-42aa34971c05a3bc340d91a71c026a77830d496f05>
7月 12 12:11:16 k8s-node-04 kernel: Memory cgroup stats for /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podf05bdc36_dd14_4537_9b91_57cc07df0932.slice/docker-052d52a163eb1ced8ddbd9968b31cb1d1ec74a2b38>
7月 12 12:11:16 k8s-node-04 kernel: [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
7月 12 12:11:16 k8s-node-04 kernel: [3897555] 0 3897555 256 1 32768 0 -998 pause
7月 12 12:11:16 k8s-node-04 kernel: [1735923] 0 1735923 426082 258638 2158592 0 994 systembusin
7月 12 12:11:16 k8s-node-04 kernel: Memory cgroup out of memory: Killed process 1735923 (systembusin) total-vm:1704328kB, anon-rss:1017592kB, file-rss:17492kB, shmem-rss:0kB, UID:0
7月 12 12:11:16 k8s-node-04 kernel: oom_reaper: reaped process 1735923 (systembusin), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
7月 12 12:11:16 k8s-node-04 systemd[1]: docker-052d52a163eb1ced8ddbd9968b31cb1d1ec74a2b389e18984737c3118fdd7a5e.scope: Consumed 2min 25.839s CPU time
7月 12 12:11:16 k8s-node-04 systemd[1]: libcontainer-4178438-systemd-test-default-dependencies.scope: Scope has no PIDs. Refusing.
7月 12 12:11:16 k8s-node-04 systemd[1]: libcontainer-4178438-systemd-test-default-dependencies.scope: Scope has no PIDs. Refusing.
7月 12 12:11:16 k8s-node-04 systemd[1]: Created slice libcontainer_4178438_systemd_test_default.slice.
7月 12 12:11:16 k8s-node-04 systemd[1]: Removed slice libcontainer_4178438_systemd_test_default.slice.
7月 12 12:11:16 k8s-node-04 systemd[1]: libcontainer-4178461-systemd-test-default-dependencies.scope: Scope has no PIDs. Refusing.
7月 12 12:11:16 k8s-node-04 systemd[1]: libcontainer-4178461-systemd-test-default-dependencies.scope: Scope has no PIDs. Refusing.
7月 12 12:11:16 k8s-node-04 systemd[1]: Created slice libcontainer_4178461_systemd_test_default.slice.
7月 12 12:11:16 k8s-node-04 systemd[1]: Removed slice libcontainer_4178461_systemd_test_default.slice.
7月 12 12:11:16 k8s-node-04 dockerd[2687797]: time="2021-07-12T12:11:16+08:00" level=error msg="stat cgroup 052d52a163eb1ced8ddbd9968b31cb1d1ec74a2b389e18984737c3118fdd7a5e" error="cgroups: cgroup deleted"
7月 12 12:11:16 k8s-node-04 systemd[1]: libcontainer-4178468-systemd-test-default-dependencies.scope: Scope has no PIDs. Refusing.
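In hindsight the dump above is noisy; restricting journalctl to kernel messages and grepping for OOM records gets you to the relevant lines much faster. A sketch, noting that -k only covers the current boot:
# journalctl -k --since "2021-07-12 12:00:00" | grep -iE "oom|killed process"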
Look at this line first:
k8s-node-04 kernel: Memory cgroup out of memory: Killed process 1735923 (systembusin) total-vm:1704328kB, anon-rss:1017592kB, file-rss:17492kB, shmem-rss:0kB, UID:0
The log reports the process's virtual memory as total-vm:1704328kB, but what counts against the Pod's limit is resident memory: anon-rss:1017592kB plus file-rss:17492kB, roughly 1011Mi, against the 1000Mi limit we noted in the describe output. The cgroup lines above confirm it: usage 1024000kB has hit the limit of 1024000kB. The key phrase is "Memory cgroup out of memory", which also rules out node-level memory pressure: the Pod was not killed because the node ran short of memory, it was killed because its own process used more memory than the Pod's limit allows.
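To keep an eye on how close the containers run to that 1000Mi limit while they are alive, kubectl top is usually enough. A sketch, assuming metrics-server is deployed in the cluster:
# kubectl top pod -n production | grep systembusiness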
Then pay close attention to this part of the log:
7月 12 12:11:16 k8s-node-04 kernel: [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
7月 12 12:11:16 k8s-node-04 kernel: [3897555] 0 3897555 256 1 32768 0 -998 pause
7月 12 12:11:16 k8s-node-04 kernel: [1735923] 0 1735923 426082 258638 2158592 0 994 systembusin
From this part of the output, the Pod has two processes that are candidates for the OOM killer. The pause process holds the shared namespaces (network, etc.) for the Pod's business containers; its oom_score_adj is -998, which effectively guarantees it will never be killed for memory reasons (the lower the oom_score_adj, the less likely a process is to be killed; see the reference list at the end). The remaining process is the actual business container, and its oom_score_adj is very high (994), so it is without doubt the reason the Pod gets OOMKilled.
The OOM killer computes an oom_score for each process based on its memory usage and then fine-tunes it with oom_score_adj. If a Pod contains multiple processes, the one using the most memory ends up with the highest oom_score and is the most likely to be killed. In other words, when a Pod is OOM-killed for exceeding its limit, the process logged with the highest oom_score is the culprit.
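These values can be inspected directly on the node for any live container process via /proc. A minimal sketch, where <pid> stands for the container's main process ID (1735923 in the log above, before it was killed):
# cat /proc/<pid>/oom_score_adj    # the adjustment set by kubelet (994 for this burstable container)
# cat /proc/<pid>/oom_score        # the score the OOM killer actually ranks processes by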
3. Fixing the problem
Now that the problem has been pinned on a specific business container, the fix is either to raise the Pod's resource limits or to work with the developers to optimize the service's memory usage. I went with the latter. :)
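For reference, if raising the limit were the chosen route, it can be done in place with kubectl set resources. A sketch with an illustrative new limit of 1500Mi (pick a value based on the service's real working set):
# kubectl -n production set resources deployment systembusiness-deployment --limits=memory=1500Mi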
References:
1. Locating why a Pod exited from common exit codes
2. journalctl notes, part 1
3. journalctl notes, part 2
4. A detailed explanation of the kernel OOM mechanism
5. Key reference: [问题已处理]-kubernetes中2次不同的oom处理