K8s Taints and Tolerations Explained in Detail
Tip: see the official Kubernetes documentation on taints and tolerations: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
Taints and tolerations work as a pair: a taint can be applied to any node in the cluster, and Pods that do not tolerate that taint cannot be scheduled onto the tainted node.
A toleration allows a Pod to be scheduled onto a node that carries a matching taint, so Pods with special requirements can still be placed on tainted nodes.
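A minimal sketch of how the two fit together (the taint key app=blue, the node name k8s-node01 and the Pod manifest below are purely illustrative assumptions, not part of the examples that follow):
### Taint a node (hypothetical key/value)
kubectl taint nodes k8s-node01 app=blue:NoSchedule
### A Pod that tolerates that taint, so the scheduler may still place it on k8s-node01
apiVersion: v1
kind: Pod
metadata:
  name: toleration-demo
spec:
  containers:
  - name: nginx
    image: registry.cn-shenzhen.aliyuncs.com/dockerghost/nginx:1.26
  tolerations:
  - key: "app"
    operator: "Equal"
    value: "blue"
    effect: "NoSchedule"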

I. Taint Parameters (node configuration)
Create a taint (a node can carry multiple taints).
Syntax: kubectl taint nodes NODE_NAME TAINT_KEY=TAINT_VALUE:EFFECT (TAINT_KEY is a custom key, TAINT_VALUE is a custom value, EFFECT is one of the three taint effects: NoSchedule, NoExecute, PreferNoSchedule)
kubectl taint nodes k8s-master01 ssd=true:PreferNoSchedule
kubectl taint nodes k8s-master01 k8s-master02 k8s-master03 ssd=true:PreferNoSchedule
### Taint effect explanation
NoSchedule        # New Pods are not scheduled onto the node; Pods already running on it are unaffected
NoExecute         # New Pods are not scheduled onto the node; Pods already on it that do not tolerate the taint are evicted immediately or after their toleration period expires
PreferNoSchedule  # The scheduler tries to avoid the node, but will still use it if no better node is available
II. Toleration Parameters (Pod configuration)
1. Method 1: exact match
tolerations:
- key: "TAINT_KEY"
  operator: "Equal"
  value: "TAINT_VALUE"
  effect: "NoSchedule"
2. Method 2: partial match
tolerations:
- key: "TAINT_KEY"
  operator: "Exists"
  effect: "NoSchedule"
# or
tolerations:
- key: "TAINT_KEY"
  operator: "Equal"
  value: "TAINT_VALUE"
3. Method 3: broad match (do not use one of the cluster's built-in taint keys here)
tolerations:
- key: "TAINT_KEY"
  operator: "Exists"
# or
tolerations:
- effect: "NoSchedule"
  operator: "Exists"
4. Method 4: match everything (not recommended)
tolerations:
- operator: "Exists"
5. Method 5: a toleration with effect NoExecute plus tolerationSeconds means the Pod is evicted from a node carrying that taint after the specified number of seconds
- Scenario 1: when a cluster node fails, Pods are evicted 300 seconds later by default; lowering tolerationSeconds makes Pods leave the failed node sooner
- Scenario 2: if the cluster suffers from heavy network jitter, the toleration period can be extended beyond the default 300 seconds
tolerations:
- key: "TAINT_KEY"
  operator: "Equal"
  value: "TAINT_VALUE"
  effect: "NoExecute"
  tolerationSeconds: 3600
III. Common Taint Commands
### Create a taint (ssd=true is a custom key/value)
kubectl taint nodes k8s-node01 ssd=true:NoExecute
### View a node's taints
kubectl describe node k8s-node01 | grep -A 3 Taints
kubectl get nodes -o=custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
### Remove a taint by key
kubectl taint nodes k8s-node01 ssd-
### Remove a taint by key + effect
kubectl taint nodes k8s-node01 ssd:NoExecute-
### Remove a taint by key + value + effect
kubectl taint nodes k8s-node01 ssd=true:NoExecute-
### Modify a taint (only the value can be changed)
kubectl taint nodes k8s-node01 ssd=false:NoExecute --overwrite
IV. Taint and Toleration Examples
1. Apply a NoSchedule taint to a node (Pods already running on the node are not evicted)
### Check the Pods already running on k8s-node01
kubectl get pods -A -owide | grep k8s-node01
### Apply a NoSchedule taint to k8s-node01
kubectl taint nodes k8s-node01 system=node:NoSchedule
### View the taints on k8s-node01 and k8s-node02
kubectl describe node k8s-node01 k8s-node02 | grep -A 3 Taints
kubectl get nodes -o=custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
### Wait about 300 seconds and check whether the Pods running on k8s-node01 were evicted; NoSchedule does not evict Pods that are already on the node
kubectl get pods -A -owide | grep k8s-node01
### Create a Deployment (its toleration exactly matches the taint's key/value pair and effect)
mkdir -p /data/yaml/taint
cat > /data/yaml/taint/nginx-deploy-noschedule.yaml << 'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx-deploy
  name: nginx-deploy
  namespace: default
spec:
  replicas: 6
  selector:
    matchLabels:
      app: nginx-pod
  template:
    metadata:
      labels:
        app: nginx-pod
    spec:
      containers:
      - image: registry.cn-shenzhen.aliyuncs.com/dockerghost/nginx:1.26
        name: nginx
      tolerations:
      - key: "system"
        operator: "Equal"
        value: "node"
        effect: "NoSchedule"
EOF
kubectl create -f /data/yaml/taint/nginx-deploy-noschedule.yaml
### Verify that the Pods can now be scheduled onto k8s-node01
kubectl get pods -n default -owide
### Remove the taint
kubectl taint nodes k8s-node01 system=node:NoSchedule-
kubectl get nodes -o=custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
2. Apply a NoExecute taint to a node (Pods that do not tolerate the taint will be evicted)
Tip: if the calico-node Pod is not evicted, that means it has tolerations covering NoExecute taints. You can check whether the calico-node Pod carries such tolerations with: kubectl get pods calico-node-rzj4b -n kube-system -oyaml | egrep "effect|operator"
### Check the Pods already running on k8s-node02
kubectl get pods -A -owide | grep k8s-node02
### Apply a NoExecute taint to k8s-node02
kubectl taint nodes k8s-node02 disk=ssd:NoExecute
### View the taints on k8s-node02
kubectl describe node k8s-node02 | grep -A 3 Taints
kubectl get nodes -o=custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
### After about 300 seconds, check whether the Pods that were running on k8s-node02 have been evicted
kubectl get pods -A -owide | grep k8s-node02
### Create a Deployment (its toleration matches any taint with effect NoExecute)
mkdir -p /data/yaml/taint
cat > /data/yaml/taint/nginx-deploy-noexecute.yaml << 'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx-deploy
  name: nginx-deploy
  namespace: default
spec:
  replicas: 6
  selector:
    matchLabels:
      app: nginx-pod
  template:
    metadata:
      labels:
        app: nginx-pod
    spec:
      containers:
      - image: registry.cn-shenzhen.aliyuncs.com/dockerghost/nginx:1.26
        name: nginx
      tolerations:
      - effect: "NoExecute"
        operator: "Exists"
EOF
kubectl create -f /data/yaml/taint/nginx-deploy-noexecute.yaml
### Verify that the Pods can now be scheduled onto k8s-node02
kubectl get pods -n default -owide
### Remove the taint from the node
kubectl taint nodes k8s-node02 disk=ssd:NoExecute-
kubectl get nodes -o=custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
3. Apply a NoSchedule taint plus a label to nodes; the Pod sets both nodeSelector and tolerations
Note: nodeSelector and tolerations solve different problems. nodeSelector is a hard constraint that restricts which nodes are eligible, while a toleration only allows (it does not force) scheduling onto tainted nodes. If the nodeSelector points at tainted nodes but the Pod's tolerations do not match those taints, the Pod cannot be scheduled anywhere and stays Pending (a verification sketch follows).
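If you want to observe that conflict, a verification sketch (assuming you deliberately leave the toleration mismatched against the ssd=true:NoSchedule taint used below; the Pod name is a placeholder):
### The Pods stay Pending: every node matching disktype=ssd carries a taint the Pod does not tolerate
kubectl get pods -n default | grep Pending
### The Pod events show the reason, e.g. "node(s) had untolerated taint"
kubectl describe pod <pending-pod-name> -n default | grep -A 5 Events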
### Apply a NoSchedule taint to k8s-node01 and k8s-node02
kubectl taint nodes k8s-node01 k8s-node02 ssd=true:NoSchedule
### View the taints on k8s-node01 and k8s-node02
kubectl describe node k8s-node01 k8s-node02 | grep -A 3 Taints
kubectl get nodes -o=custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
### Add a label to k8s-node01, k8s-node02 and k8s-node03 (disktype=ssd is a custom label)
kubectl label node k8s-node01 k8s-node02 k8s-node03 disktype=ssd
### View the node labels
kubectl get nodes --show-labels | grep disktype=ssd
kubectl get nodes --show-labels -l disktype=ssd
### Create a Deployment (the Pod sets both nodeSelector and tolerations; the toleration exactly matches the taint's key/value pair and effect)
mkdir -p /data/yaml/taint
cat > /data/yaml/taint/nginx-deploy-podnoschedule.yaml << 'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx-deploy
  name: nginx-deploy
spec:
  replicas: 6
  selector:
    matchLabels:
      app: nginx-pod
  template:
    metadata:
      labels:
        app: nginx-pod
    spec:
      containers:
      - image: registry.cn-shenzhen.aliyuncs.com/dockerghost/nginx:1.26
        name: nginx
      nodeSelector:
        disktype: "ssd"
      tolerations:
      - key: "ssd"
        operator: "Equal"
        value: "true"
        effect: "NoSchedule"
EOF
kubectl create -f /data/yaml/taint/nginx-deploy-podnoschedule.yaml
### Verify that the Pods can now be scheduled onto k8s-node01, k8s-node02 and k8s-node03
kubectl get pods -n default -owide
### Remove the taints from the nodes
kubectl taint nodes k8s-node01 k8s-node02 ssd=true:NoSchedule-
kubectl get nodes -o=custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
### Remove the node labels
kubectl label nodes k8s-node01 k8s-node02 k8s-node03 disktype-
kubectl get nodes --show-labels -l disktype=ssd
4. Apply a NoExecute taint plus a label to nodes; the Pod sets both nodeSelector and tolerations
Note: the same caveat as in example 3 applies: if the Pod's tolerations do not match the taints on the nodeSelector-selected nodes, the Pod cannot be scheduled and stays Pending.
### Apply a NoExecute taint to k8s-node01 and k8s-node02
kubectl taint nodes k8s-node01 k8s-node02 ssd=true:NoExecute
### View the taints on k8s-node01 and k8s-node02
kubectl describe node k8s-node01 k8s-node02 | grep -A 3 Taints
kubectl get nodes -o=custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
### Add a label to k8s-node01, k8s-node02 and k8s-node03 (disktype=ssd is a custom label)
kubectl label node k8s-node01 k8s-node02 k8s-node03 disktype=ssd
### View the node labels
kubectl get nodes --show-labels | grep disktype=ssd
kubectl get nodes --show-labels -l disktype=ssd
### Create a Deployment (the Pod sets both nodeSelector and tolerations; the toleration exactly matches the taint's key/value pair and effect)
mkdir -p /data/yaml/taint
cat > /data/yaml/taint/nginx-deploy-podnoexecute.yaml << 'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx-deploy
  name: nginx-deploy
spec:
  replicas: 6
  selector:
    matchLabels:
      app: nginx-pod
  template:
    metadata:
      labels:
        app: nginx-pod
    spec:
      containers:
      - image: registry.cn-shenzhen.aliyuncs.com/dockerghost/nginx:1.26
        name: nginx
      nodeSelector:
        disktype: "ssd"
      tolerations:
      - key: "ssd"
        operator: "Equal"
        value: "true"
        effect: "NoExecute"
EOF
kubectl create -f /data/yaml/taint/nginx-deploy-podnoexecute.yaml
### Verify that the Pods can now be scheduled onto k8s-node01, k8s-node02 and k8s-node03
kubectl get pods -n default -owide
### Remove the taints from the nodes
kubectl taint nodes k8s-node01 k8s-node02 ssd=true:NoExecute-
kubectl get nodes -o=custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
### Remove the node labels
kubectl label nodes k8s-node01 k8s-node02 k8s-node03 disktype-
kubectl get nodes --show-labels -l disktype=ssd
V. Built-in Cluster Taints and Default Pod Tolerations
1. Built-in cluster taints
node.kubernetes.io/not-ready                    # The node is not ready, i.e. its Ready condition is False
node.kubernetes.io/unreachable                  # The node controller cannot reach the node, i.e. its Ready condition is Unknown
node.kubernetes.io/out-of-disk                  # The node has run out of disk space
node.kubernetes.io/memory-pressure              # The node is under memory pressure
node.kubernetes.io/disk-pressure                # The node is under disk pressure
node.kubernetes.io/network-unavailable          # The node's network is unavailable
node.kubernetes.io/unschedulable                # The node is unschedulable (e.g. cordoned)
node.cloudprovider.kubernetes.io/uninitialized  # When the kubelet starts with an external cloud provider, this taint marks the node as unusable; it is removed once a controller in cloud-controller-manager has initialized the node
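These taints are added and removed automatically by the node lifecycle controller. A quick way to watch them appear on a failing node (a sketch, reusing the k8s-node02 name from the examples below):
### Once a node stops reporting, taints such as node.kubernetes.io/unreachable:NoExecute show up in its spec
kubectl get node k8s-node02 -o jsonpath='{.spec.taints}'
kubectl describe node k8s-node02 | grep -A 5 Taints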
2. Default tolerations added to newly created Pods
### Create a Deployment
kubectl create deploy nginx-deploy --image=registry.cn-shenzhen.aliyuncs.com/dockerghost/nginx:1.26 -n default
### View the Deployment and Pods
kubectl get deploy -n default
kubectl get pods -n default
### View the Pod's default tolerations
kubectl get pods nginx-deploy-6988f8548f-swkfv -n default -oyaml
..............................
tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready        # the node is not ready (Ready condition is False)
    operator: Exists
    tolerationSeconds: 300                   # by default the Pod is evicted 5 minutes after the taint appears
  - effect: NoExecute
    key: node.kubernetes.io/unreachable      # the node controller cannot reach the node (Ready condition is Unknown)
    operator: Exists
    tolerationSeconds: 300                   # by default the Pod is evicted 5 minutes after the taint appears
..............................
VI. Simulating a Node Failure to Migrate Pods Quickly
### When a node goes down, how long it takes for its status to change from Ready to Unknown/False is controlled by the kube-controller-manager flag --node-monitor-grace-period
[root@k8s-master01 ~]# cat /usr/lib/systemd/system/kube-controller-manager.service | grep "node-monitor-grace-period"
--node-monitor-grace-period=40s \
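The output above assumes a binary/systemd installation. On kubeadm-based clusters the same flag, if set, lives in the kube-controller-manager static Pod manifest instead; a sketch assuming the default manifest path:
grep node-monitor-grace-period /etc/kubernetes/manifests/kube-controller-manager.yaml
### No output means the flag is not set explicitly and the built-in default (40s) applies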
### Apply a NoExecute taint and a label to k8s-node01 and k8s-node02
kubectl taint nodes k8s-node01 k8s-node02 ssd=true:NoExecute
kubectl label nodes k8s-node01 k8s-node02 disktype=ssd
### View the taints and labels on k8s-node01 and k8s-node02
kubectl get nodes -o=custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
kubectl get nodes --show-labels -l disktype=ssd
### Create a Deployment (the nginx Pod tolerates a failed node for only 10 seconds before being rescheduled elsewhere)
mkdir -p /data/yaml/taint
cat > /data/yaml/taint/nginx-deploy.yaml << 'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx-deploy
  name: nginx-deploy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-pod
  template:
    metadata:
      labels:
        app: nginx-pod
    spec:
      containers:
      - image: registry.cn-shenzhen.aliyuncs.com/dockerghost/nginx:1.26
        name: nginx
      nodeSelector:
        disktype: "ssd"
      tolerations:
      - key: "ssd"
        operator: "Equal"
        value: "true"
        effect: "NoExecute"
      - effect: NoExecute
        key: node.kubernetes.io/not-ready
        operator: Exists
        tolerationSeconds: 10
      - effect: NoExecute
        key: node.kubernetes.io/unreachable
        operator: Exists
        tolerationSeconds: 10
EOF
kubectl create -f /data/yaml/taint/nginx-deploy.yaml
kubectl get pods -owide -n default
### Create a Deployment (the redis Pod tolerates the ssd taint but keeps the built-in default toleration period of 300 seconds)
cat > /data/yaml/taint/redis-deploy.yaml << 'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: redis-deploy
  name: redis-deploy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis-pod
  template:
    metadata:
      labels:
        app: redis-pod
    spec:
      containers:
      - image: registry.cn-shenzhen.aliyuncs.com/dockerghost/redis:latest
        name: redis
      nodeSelector:
        disktype: "ssd"
      tolerations:
      - key: "ssd"
        operator: "Equal"
        value: "true"
        effect: "NoExecute"
EOF
kubectl create -f /data/yaml/taint/redis-deploy.yaml
kubectl get pods -owide -n default
### At this point the nginx and redis Pods both happen to be running on k8s-node02
[root@k8s-master01 ~]# kubectl get pods -owide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-deploy-67745bdcf8-jfsst 1/1 Running 0 2m31s 172.30.58.244 k8s-node02 <none> <none>
redis-deploy-6d549cd6bd-xdb5x 1/1 Running 0 50s 172.30.58.245 k8s-node02 <none> <none>
### Log in to k8s-node02 and shut it down to simulate an unexpected node failure
init 0
### From a master node, check node status; k8s-node02 is now NotReady
[root@k8s-master01 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master01 Ready <none> 14d v1.28.15
k8s-master02 Ready <none> 14d v1.28.15
k8s-master03 Ready <none> 14d v1.28.15
k8s-node01 Ready <none> 14d v1.28.15
k8s-node02 NotReady <none> 14d v1.28.15
### From a master node, check Pod status: the redis Pod's toleration period (default 300s) is longer than the nginx Pod's (10s), so the nginx Pod has already been rescheduled off k8s-node02 onto k8s-node01, while the redis Pod is still waiting out its toleration period on k8s-node02
[root@k8s-master01 ~]# kubectl get pods -owide -n default
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-deploy-67745bdcf8-jfsst 1/1 Terminating 0 4m20s 172.30.58.244 k8s-node02 <none> <none>
nginx-deploy-67745bdcf8-np7hl 1/1 Running 0 19s 172.30.85.203 k8s-node01 <none> <none>
redis-deploy-6d549cd6bd-xdb5x 1/1 Running 0 2m39s 172.30.58.245 k8s-node02 <none> <none>
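The old nginx replica stays in Terminating until k8s-node02 comes back online (or the node is deleted from the cluster). If you need to clean it up manually, a sketch using the Pod name from this particular run (yours will differ):
kubectl delete pod nginx-deploy-67745bdcf8-jfsst -n default --grace-period=0 --force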
### Remove the taints from the nodes
kubectl taint nodes k8s-node01 k8s-node02 ssd=true:NoExecute-
kubectl get nodes -o=custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
### Remove the node labels
kubectl label nodes k8s-node01 k8s-node02 disktype-
kubectl get nodes --show-labels -l disktype=ssd
Summary
The above is based on personal experience; hopefully it serves as a useful reference.