




What Is the Kubernetes Resource QoS Mechanism

Published: 2021-12-20 10:14:03  Source: 億速云 (Yisu Cloud)  Views: 167  Author: iii  Category: Cloud Computing

This article explains what the Kubernetes Resource QoS mechanism is: how a Pod is assigned a QoS class, what each class guarantees, and how the classes translate into OOM score settings on the node.

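Introduction to Kubernetes Resource QoS Classes

Kubernetes assigns a Pod's QoS class based on the request and limit values set on the resources of its containers.

For each resource, containers fall into one of three QoS classes: Guaranteed, Burstable, and Best-Effort, in decreasing order of QoS level.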

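  • Guaranteed: if, for every container in the Pod, the limit and request of every resource are equal and non-zero, the Pod's QoS class is Guaranteed. A request that is left unset defaults to the limit, so specifying only limits is enough.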



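    containers:
      - name: foo
        resources:
          limits:
            cpu: 10m
            memory: 1Gi
      - name: bar
        resources:
          limits:
            cpu: 100m
            memory: 100Mi

    containers:
      - name: foo
        resources:
          limits:
            cpu: 10m
            memory: 1Gi
          requests:
            cpu: 10m
            memory: 1Gi

      - name: bar
        resources:
          limits:
            cpu: 100m
            memory: 100Mi
          requests:
            cpu: 100m
            memory: 100Mi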
  • Best-Effort: if no container in the Pod sets a request or a limit for any resource, the Pod's QoS class is Best-Effort.


    containers:
      - name: foo
      - name: bar
  • Burstable: every Pod that matches neither the Guaranteed nor the Best-Effort case has the QoS class Burstable.

When a limit is not specified, its effective value is the node's capacity for that resource.



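    containers:
      - name: foo
        resources:
          limits:
            cpu: 10m
            memory: 1Gi
          requests:
            cpu: 10m
            memory: 1Gi

      - name: bar


    containers:
      - name: foo
        resources:
          limits:
            memory: 1Gi

      - name: bar
        resources:
          limits:
            cpu: 100m


    containers:
      - name: foo
        resources:
          requests:
            cpu: 10m
            memory: 1Gi

      - name: bar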


kube-scheduler selects a node for a Pod based on the Pod's request values. A Pod and all of its containers are not allowed to consume more than the effective value of the limit (if one is set).

How the request and limit are enforced depends on whether the resource is compressible or incompressible.

Compressible Resource Guarantees

  • For now, we are only supporting CPU.

  • Pods are guaranteed to get the amount of CPU they request; they may or may not get additional CPU time, depending on the other jobs running. This isn't fully guaranteed today because CPU isolation is at the container level. Pod-level cgroups will be introduced soon to achieve this goal.

  • Excess CPU resources are distributed in proportion to the amount of CPU requested. For example, suppose container A requests 600 milli CPUs and container B requests 300 milli CPUs, and both containers are trying to use as much CPU as they can. Then the extra CPU (100 milli CPUs on a node with 1000 milli CPUs) is distributed to A and B in a 2:1 ratio (implementation discussed in later sections).

  • Pods will be throttled if they exceed their limit. If limit is unspecified, then the pods can use excess CPU when available.
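
The proportional split of excess CPU can be checked with a little arithmetic. The following Go sketch is illustrative only: `distributeExcess` is a hypothetical helper, not a Kubernetes API, and it assumes a node with 1000 milli CPUs so that 100 milli CPUs are left over after A and B receive their requests.

```go
package main

import "fmt"

// distributeExcess splits spareMilli millicores among containers in
// proportion to their CPU requests (also in millicores). Integer division
// rounds each share down, so a few millicores may remain unassigned.
func distributeExcess(spareMilli int64, requests map[string]int64) map[string]int64 {
	var total int64
	for _, r := range requests {
		total += r
	}
	shares := make(map[string]int64, len(requests))
	if total == 0 {
		return shares // no requests, so there is nothing to weight by
	}
	for name, r := range requests {
		shares[name] = spareMilli * r / total
	}
	return shares
}

func main() {
	// Assumed scenario: a 1000m node, A requests 600m, B requests 300m,
	// leaving 100m of spare CPU while both containers are busy.
	shares := distributeExcess(100, map[string]int64{"A": 600, "B": 300})
	fmt.Println(shares["A"], shares["B"]) // prints "66 33", a 2:1 ratio
}
```

The sketch only reproduces the ratio; on a real node the kernel scheduler enforces the weighting continuously rather than handing out fixed amounts.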

Incompressible Resource Guarantees

  • For now, we are only supporting memory.

  • Pods will get the amount of memory they request. If they exceed their memory request, they could be killed (if some other pod needs memory); if they consume less memory than requested, they will not be killed (except in cases where system tasks or daemons need more memory).

  • When Pods use more memory than their limit, the process using the most memory inside one of the pod's containers will be killed by the kernel.

Admission/Scheduling Policy

  • Pods will be admitted by Kubelet & scheduled by the scheduler based on the sum of requests of its containers. The scheduler & kubelet will ensure that sum of requests of all containers is within the node's allocatable capacity (for both memory and CPU).
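
As a rough illustration of this admission rule, the sketch below sums the requests of a pod's containers and checks them against what is still free on a node. The `Resources` type and the `sumRequests` and `fits` helpers are hypothetical names chosen for this example; the real scheduler and kubelet work with richer types and additional checks.

```go
package main

import "fmt"

// Resources holds CPU in millicores and memory in bytes.
type Resources struct {
	MilliCPU int64
	Memory   int64
}

// sumRequests adds up the requests of all containers in one pod.
func sumRequests(containers []Resources) Resources {
	var total Resources
	for _, c := range containers {
		total.MilliCPU += c.MilliCPU
		total.Memory += c.Memory
	}
	return total
}

// fits reports whether a new pod's requests fit on a node, given the node's
// allocatable resources and the requests already placed on it.
func fits(allocatable, alreadyRequested, pod Resources) bool {
	return alreadyRequested.MilliCPU+pod.MilliCPU <= allocatable.MilliCPU &&
		alreadyRequested.Memory+pod.Memory <= allocatable.Memory
}

func main() {
	const gi = int64(1) << 30
	node := Resources{MilliCPU: 4000, Memory: 8 * gi} // assumed allocatable
	used := Resources{MilliCPU: 3500, Memory: 2 * gi} // assumed existing requests
	pod := sumRequests([]Resources{
		{MilliCPU: 300, Memory: 1 * gi},
		{MilliCPU: 100, Memory: 1 * gi},
	})
	fmt.Println(fits(node, used, pod)) // true: 3500m+400m <= 4000m, 2Gi+2Gi <= 8Gi
	pod.MilliCPU = 600
	fmt.Println(fits(node, used, pod)) // false: 3500m+600m > 4000m
}
```

Note that only requests enter the check; limits play no part in deciding whether a pod is admitted.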


  • CPU: Pods will not be killed if CPU guarantees cannot be met (for example, if system tasks or daemons take up lots of CPU); they will be temporarily throttled.

  • Memory: Memory is an incompressible resource, so the semantics of memory management deserve a closer look.

    • Best-Effort pods will be treated as lowest priority. Processes in these pods are the first to get killed if the system runs out of memory. These containers can use any amount of free memory in the node though.

    • Guaranteed pods are considered top-priority and are guaranteed to not be killed until they exceed their limits, or if the system is under memory pressure and there are no lower priority containers that can be evicted.

    • Burstable pods have some form of minimal resource guarantee, but can use more resources when available. Under system memory pressure, these containers are more likely to be killed once they exceed their requests and no Best-Effort pods exist.

OOM Score configuration at the Nodes

Pod OOM score configuration

  • Note that the OOM score of a process is 10 times the % of memory the process consumes, adjusted by OOM_SCORE_ADJ, barring exceptions (e.g. process is launched by root). Processes with higher OOM scores are killed first.

  • The base OOM score is between 0 and 1000, so if process A’s OOM_SCORE_ADJ - process B’s OOM_SCORE_ADJ is over 1000, then process A will always be OOM killed before B.

  • The final OOM score of a process is also between 0 and 1000.
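
The three rules above can be put into a small model. This Go sketch is a simplification written for this article: `oomScore` is a hypothetical function, it ignores the kernel's exceptions (such as root-owned processes), and the adjustment values in `main` are example inputs.

```go
package main

import "fmt"

// oomScore models the rule described above: ten times the percentage of
// memory a process uses, shifted by oomScoreAdj, then clamped to [0, 1000].
// This is a simplified model; the kernel's real calculation has exceptions.
func oomScore(memoryPercent float64, oomScoreAdj int) int {
	score := int(10*memoryPercent) + oomScoreAdj
	if score < 0 {
		return 0
	}
	if score > 1000 {
		return 1000
	}
	return score
}

func main() {
	fmt.Println(oomScore(5, 1000))  // adj 1000: clamped to 1000 at any usage
	fmt.Println(oomScore(50, -998)) // adj -998: clamped to 0 even at 50% usage
	fmt.Println(oomScore(5, 900))   // adj 900, 5% used: 950
	fmt.Println(oomScore(12, 900))  // adj 900, 12% used: clamped to 1000
}
```

Because of the clamping, an adjustment of 1000 pins a process at the top of the kill order, and a strongly negative adjustment pins it near the bottom, regardless of how much memory it actually uses.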


Best-effort

  • Set OOM_SCORE_ADJ: 1000

  • So processes in best-effort containers will have an OOM_SCORE of 1000


Guaranteed

  • Set OOM_SCORE_ADJ: -998

  • So processes in guaranteed containers will have an OOM_SCORE of 0 or 1


Burstable

  • If total memory request > 99.8% of available memory, OOM_SCORE_ADJ: 2

  • Otherwise, set OOM_SCORE_ADJ to 1000 - 10 * (% of memory requested)

  • This ensures that the OOM_SCORE of a burstable pod is > 1

  • If memory request is 0, OOM_SCORE_ADJ is set to 999.

  • So burstable pods will be killed if they conflict with guaranteed pods

  • If a burstable pod uses less memory than requested, its OOM_SCORE < 1000

  • So best-effort pods will be killed if they conflict with burstable pods using less than requested memory

  • If a process in a burstable pod's container uses more memory than the container requested, its OOM_SCORE will be 1000; otherwise its OOM_SCORE will be < 1000

  • Assuming that a container typically has a single big process, if a burstable pod's container that uses more memory than requested conflicts with another burstable pod's container using less memory than requested, the former will be killed

  • If containers of burstable pods with multiple processes conflict, the formula for OOM scores is only a heuristic; it will not ensure "Request and Limit" guarantees.

Pod infra containers or Special Pod init process

  • OOM_SCORE_ADJ: -998

Kubelet, Docker

  • OOM_SCORE_ADJ: -999 (won’t be OOM killed)

  • Hack, because these critical tasks might die if they conflict with guaranteed containers. In the future, we should place all user-pods into a separate cgroup, and set a limit on the memory they can consume.



The OOM_SCORE_ADJ values discussed above for each QoS class are defined by the following constants:


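const (
	// Infrastructure and Guaranteed processes get strongly negative
	// adjustments so that they are killed last.
	PodInfraOOMAdj        int = -998
	KubeletOOMScoreAdj    int = -999
	DockerOOMScoreAdj     int = -999
	KubeProxyOOMScoreAdj  int = -999
	guaranteedOOMScoreAdj int = -998
	// Best-Effort processes get the maximum adjustment so that they are killed first.
	besteffortOOMScoreAdj int = 1000
)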



func GetContainerOOMScoreAdjust(pod *v1.Pod, container *v1.Container, memoryCapacity int64) int {
	switch GetPodQOS(pod) {
	case Guaranteed:
		// Guaranteed containers should be the last to get killed.
		return guaranteedOOMScoreAdj
	case BestEffort:
		return besteffortOOMScoreAdj
	}

	// Burstable containers are a middle tier, between Guaranteed and Best-Effort. Ideally,
	// we want to protect Burstable containers that consume less memory than requested.
	// The formula below is a heuristic. A container requesting for 10% of a system's
	// memory will have an OOM score adjust of 900. If a process in container Y
	// uses over 10% of memory, its OOM score will be 1000. The idea is that containers
	// which use more than their request will have an OOM score of 1000 and will be prime
	// targets for OOM kills.
	// Note that this is a heuristic, it won't work if a container has many small processes.
	memoryRequest := container.Resources.Requests.Memory().Value()
	oomScoreAdjust := 1000 - (1000*memoryRequest)/memoryCapacity
	// A guaranteed pod using 100% of memory can have an OOM score of 10. Ensure
	// that burstable pods have a higher OOM score adjustment.
	if int(oomScoreAdjust) < (1000 + guaranteedOOMScoreAdj) {
		return (1000 + guaranteedOOMScoreAdj)
	}
	// Give burstable pods a higher chance of survival over besteffort pods.
	if int(oomScoreAdjust) == besteffortOOMScoreAdj {
		return int(oomScoreAdjust - 1)
	}
	return int(oomScoreAdjust)
}
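
To see which values the Burstable branch produces, the arithmetic can be reproduced without any Kubernetes types. The sketch below is a standalone approximation: `burstableOOMScoreAdj` is a hypothetical name, and the 16Gi node in `main` is an assumed example.

```go
package main

import "fmt"

// burstableOOMScoreAdj mirrors the Burstable branch above using plain
// integers: memoryRequest and memoryCapacity are both in bytes.
func burstableOOMScoreAdj(memoryRequest, memoryCapacity int64) int {
	const guaranteedAdj, bestEffortAdj = -998, 1000
	adj := int(1000 - (1000*memoryRequest)/memoryCapacity)
	if adj < 1000+guaranteedAdj {
		return 1000 + guaranteedAdj // floor of 2, just above Guaranteed
	}
	if adj == bestEffortAdj {
		return adj - 1 // cap at 999, just below Best-Effort
	}
	return adj
}

func main() {
	const gi = int64(1) << 30
	capacity := 16 * gi // hypothetical node with 16Gi of memory
	for _, req := range []int64{0, 1 * gi, 4 * gi, 16 * gi} {
		fmt.Printf("request=%2dGi adj=%d\n", req/gi, burstableOOMScoreAdj(req, capacity))
	}
}
```

On this assumed node the requests 0, 1Gi, 4Gi and 16Gi yield adjustments of 999, 938, 750 and 2, so a larger request always buys a lower adjustment while staying strictly between the Guaranteed and Best-Effort values.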

The QoS class of a Pod is determined by the following function:


// GetPodQOS returns the QoS class of a pod.
// A pod is besteffort if none of its containers have specified any requests or limits.
// A pod is guaranteed only when requests and limits are specified for all the containers and they are equal.
// A pod is burstable if limits and requests do not match across all containers.
func GetPodQOS(pod *v1.Pod) QOSClass {
	requests := v1.ResourceList{}
	limits := v1.ResourceList{}
	zeroQuantity := resource.MustParse("0")
	isGuaranteed := true
	for _, container := range pod.Spec.Containers {
		// process requests: sum non-zero requests per supported resource
		for name, quantity := range container.Resources.Requests {
			if !supportedQoSComputeResources.Has(string(name)) {
				continue
			}
			if quantity.Cmp(zeroQuantity) == 1 {
				delta := quantity.Copy()
				if _, exists := requests[name]; !exists {
					requests[name] = *delta
				} else {
					delta.Add(requests[name])
					requests[name] = *delta
				}
			}
		}
		// process limits: sum non-zero limits and record which were set
		qosLimitsFound := sets.NewString()
		for name, quantity := range container.Resources.Limits {
			if !supportedQoSComputeResources.Has(string(name)) {
				continue
			}
			if quantity.Cmp(zeroQuantity) == 1 {
				qosLimitsFound.Insert(string(name))
				delta := quantity.Copy()
				if _, exists := limits[name]; !exists {
					limits[name] = *delta
				} else {
					delta.Add(limits[name])
					limits[name] = *delta
				}
			}
		}

		// A container missing a limit for any supported resource rules out Guaranteed.
		if len(qosLimitsFound) != len(supportedQoSComputeResources) {
			isGuaranteed = false
		}
	}
	if len(requests) == 0 && len(limits) == 0 {
		return BestEffort
	}
	// Check if requests match limits for all resources.
	if isGuaranteed {
		for name, req := range requests {
			if lim, exists := limits[name]; !exists || lim.Cmp(req) != 0 {
				isGuaranteed = false
				break
			}
		}
	}
	if isGuaranteed &&
		len(requests) == len(limits) {
		return Guaranteed
	}
	return Burstable
}
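
The classification rules can also be tried out on the example pods from the introduction. The following Go sketch is a simplified stand-in, not the function above: `Container` and `podQOS` are hypothetical, quantities are plain integers, the check is done per container rather than on summed totals, and an unset request is explicitly defaulted to the limit to mimic the API defaulting that happens before classification.

```go
package main

import "fmt"

// Container is a hypothetical, simplified stand-in for v1.Container.
// Maps are keyed by resource name ("cpu", "memory"); values are plain
// integers (millicores or bytes), and zero or missing means "not set".
type Container struct {
	Requests map[string]int64
	Limits   map[string]int64
}

var qosResources = []string{"cpu", "memory"}

// podQOS classifies a pod from its containers' requests and limits.
func podQOS(containers []Container) string {
	anySet := false
	guaranteed := true
	for _, c := range containers {
		for _, name := range qosResources {
			lim := c.Limits[name] // reading a nil map yields 0
			req := c.Requests[name]
			if req == 0 {
				req = lim // assumption: an unset request defaults to the limit
			}
			if req > 0 || lim > 0 {
				anySet = true
			}
			if lim == 0 || req != lim {
				guaranteed = false
			}
		}
	}
	switch {
	case !anySet:
		return "BestEffort"
	case guaranteed:
		return "Guaranteed"
	default:
		return "Burstable"
	}
}

func main() {
	const gi = int64(1) << 30
	foo := map[string]int64{"cpu": 10, "memory": gi}
	bar := map[string]int64{"cpu": 100, "memory": 100 << 20}

	fmt.Println(podQOS([]Container{{Limits: foo}, {Limits: bar}}))   // Guaranteed
	fmt.Println(podQOS([]Container{{}, {}}))                         // BestEffort
	fmt.Println(podQOS([]Container{{Requests: foo, Limits: foo}, {}})) // Burstable
	fmt.Println(podQOS([]Container{{Requests: foo}, {}}))            // Burstable
}
```

The per-container check keeps the sketch short; it can disagree with a sum-based comparison in unusual cases where containers have mismatched requests and limits that happen to add up to equal totals.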

