From e2acc03c69f38b466237d0c021a1ca315c3e1d89 Mon Sep 17 00:00:00 2001 From: yangchuanhu Date: Tue, 21 May 2019 10:20:44 +0800 Subject: [PATCH] =?UTF-8?q?=E6=B7=BB=E5=8A=A0=E9=9B=86=E7=BE=A4=E5=8D=87?= =?UTF-8?q?=E7=BA=A7=E6=96=87=E6=A1=A3?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- SUMMARY.md | 3 + ...06\347\276\244\347\216\257\345\242\203.md" | 11 +- ...24.Deployment \344\275\277\347\224\250.md" | 2 +- docs/36.Jenkins Slave.md | 3 +- docs/37.Jenkins Pipeline.md | 6 + docs/65.Gitlab CI.md | 8 + docs/66.devops.md | 8 +- "docs/67.Upgrade\351\233\206\347\276\244.md" | 277 ++++++++++++++++++ 8 files changed, 307 insertions(+), 11 deletions(-) create mode 100644 "docs/67.Upgrade\351\233\206\347\276\244.md" diff --git a/SUMMARY.md b/SUMMARY.md index f9d4c65..2eb1cbc 100644 --- a/SUMMARY.md +++ b/SUMMARY.md @@ -101,3 +101,6 @@ * [Gitlab CI](docs/65.Gitlab CI.md) * [Devops](docs/66.devops.md) + +### 其他: +* [集群升级](docs/67.Upgrade集群.md) diff --git "a/docs/16.\347\224\250 kubeadm \346\220\255\345\273\272\351\233\206\347\276\244\347\216\257\345\242\203.md" "b/docs/16.\347\224\250 kubeadm \346\220\255\345\273\272\351\233\206\347\276\244\347\216\257\345\242\203.md" index 628c193..6077038 100644 --- "a/docs/16.\347\224\250 kubeadm \346\220\255\345\273\272\351\233\206\347\276\244\347\216\257\345\242\203.md" +++ "b/docs/16.\347\224\250 kubeadm \346\220\255\345\273\272\351\233\206\347\276\244\347\216\257\345\242\203.md" @@ -133,9 +133,10 @@ gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg EOF ``` -目前阿里云的源最新版本已经是**1.10.2**版本,所以可以直接安装,由于我们上面的相关镜像是关联的1.10版本,所以我们安装的时候需要指定版本。yum 源配置完成后,执行安装命令即可: +> **注意:**由于阿里云的源将依赖进行了更改,如果你需要安装 1.10.0 版本的集群的话,需要使用下面的命令: + ```shell -$ yum makecache fast && yum install -y kubelet-1.10.0-0 kubeadm-1.10.0-0 kubectl-1.10.0-0 +$ yum makecache fast && yum install -y kubelet-1.10.0-0 kubeadm-1.10.0-0 kubectl-1.10.0-0 kubernetes-cni-0.6.0-0.x86_64.rpm ``` 正常情况我们可以都能顺利安装完成上面的文件。 @@ -252,10 +253,10 @@ e9ca4d9550e698105f1d8fae7ecfd297dd9331ca7d50b5493fa0491b2b4df40c 另外还需要注意的是当前版本的 kubeadm 支持的docker版本最大是 17.03,所以要注意下。 上面的信息记录了 kubeadm 初始化整个集群的过程,生成相关的各种证书、kubeconfig 文件、bootstraptoken 等等,后边是使用`kubeadm join`往集群中添加节点时用到的命令,下面的命令是配置如何使用kubectl访问集群的方式: ```shell -mkdir -p $HOME/.kube -sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config +mkdir -p $HOME/.kube +sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config sudo chown $(id -u):$(id -g) $HOME/.kube/config -``` +``` 最后给出了将节点加入集群的命令: ```shell diff --git "a/docs/24.Deployment \344\275\277\347\224\250.md" "b/docs/24.Deployment \344\275\277\347\224\250.md" index 58228ee..f8ca5ff 100644 --- "a/docs/24.Deployment \344\275\277\347\224\250.md" +++ "b/docs/24.Deployment \344\275\277\347\224\250.md" @@ -95,7 +95,7 @@ strategy: * minReadySeconds: * Kubernetes在等待设置的时间后才进行升级 * 如果没有设置该值,Kubernetes会假设该容器启动起来后就提供服务了 - * 如果没有设置该值,在某些极端情况下可能会造成服务服务正常运行 + * 如果没有设置该值,在某些极端情况下可能会造成服务不正常运行 * maxSurge: * 升级过程中最多可以比原先设置多出的POD数量 * 例如:maxSurage=1,replicas=5,则表示Kubernetes会先启动1一个新的Pod后才删掉一个旧的POD,整个升级过程中最多会有5+1个POD。 diff --git a/docs/36.Jenkins Slave.md b/docs/36.Jenkins Slave.md index e3401a8..37bf7d2 100644 --- a/docs/36.Jenkins Slave.md +++ b/docs/36.Jenkins Slave.md @@ -310,7 +310,8 @@ Jenkins 安装完成了,接下来我们不用急着就去使用,我们要了 另外一些同学在配置了后运行 Slave Pod 的时候出现了权限问题,因为 Jenkins Slave Pod 中没有配置权限,所以需要配置上 ServiceAccount,在 Slave Pod 配置的地方点击下面的高级,添加上对应的 ServiceAccount 即可: -![kubernetes plugin config5](https://ws1.sinaimg.cn/large/006tNc79gy1g2qkhzyw0zj30kn0ebq4g.jpg) +![kubernetes plugin 
config5](https://bxdc-static.oss-cn-beijing.aliyuncs.com/images/jenkins-k8s-config5.png) + 到这里我们的 Kubernetes Plugin 插件就算配置完成了。 diff --git a/docs/37.Jenkins Pipeline.md b/docs/37.Jenkins Pipeline.md index 7a99d72..1b10a3b 100644 --- a/docs/37.Jenkins Pipeline.md +++ b/docs/37.Jenkins Pipeline.md @@ -231,6 +231,9 @@ spec: - image: cnych/jenkins-demo: imagePullPolicy: IfNotPresent name: jenkins-demo + env: + - name: branch + value: ``` 对于 Kubernetes 比较熟悉的同学,对上面这个 YAML 文件一定不会陌生,我们使用一个 Deployment 资源对象来管理 Pod,该 Pod 使用的就是我们上面推送的镜像,唯一不同的地方是 Docker 镜像的 tag 不是我们平常见的具体的 tag,而是一个 的标识,实际上如果我们将这个标识替换成上面的 Docker 镜像的 tag,是不是就是最终我们本次构建需要使用到的镜像?怎么替换呢?其实也很简单,我们使用一个**sed**命令就可以实现了: @@ -238,6 +241,7 @@ spec: stage('YAML') { echo "5. Change YAML File Stage" sh "sed -i 's//${build_tag}/' k8s.yaml" + sh "sed -i 's//${env.BRANCH_NAME}/' k8s.yaml" } ``` @@ -274,6 +278,7 @@ stage('YAML') { ) echo "This is a deploy step to ${userInput.Env}" sh "sed -i 's//${build_tag}/' k8s.yaml" + sh "sed -i 's//${env.BRANCH_NAME}/' k8s.yaml" } ``` @@ -331,6 +336,7 @@ node('haimaxy-jnlp') { ) echo "This is a deploy step to ${userInput}" sh "sed -i 's//${build_tag}/' k8s.yaml" + sh "sed -i 's//${env.BRANCH_NAME}/' k8s.yaml" if (userInput == "Dev") { // deploy dev stuff } else if (userInput == "QA"){ diff --git a/docs/65.Gitlab CI.md b/docs/65.Gitlab CI.md index 595fcc7..21b5c07 100644 --- a/docs/65.Gitlab CI.md +++ b/docs/65.Gitlab CI.md @@ -135,9 +135,17 @@ gitlab-runner@gitlab-ci-runner-0:/$ gitlab-ci-multi-runner register --help --kubernetes-helper-cpu-limit value The CPU allocation given to build helper containers (default: "500m") [$KUBERNETES_HELPER_CPU_LIMIT] --kubernetes-helper-memory-limit value The amount of memory allocated to build helper containers (default: "3Gi") [$KUBERNETES_HELPER_MEMORY_LIMIT] --kubernetes-cpu-request value The CPU allocation requested for build containers [$KUBERNETES_CPU_REQUEST] +... +--pre-clone-script value Runner-specific command script executed before code is pulled [$RUNNER_PRE_CLONE_SCRIPT] [...] 
 ```
+如果定义的 Gitlab 域名并不是通过外网的 DNS 进行解析的,而是通过 /etc/hosts 来进行映射的,那么我们就需要在 runner 的 Pod 中去添加 git.qikqiak.com 对应的 hosts 了,那么如何添加呢?我们可以想到的是 Pod 的 hostAlias 可以实现这个需求,但是 runner 的 Pod 是自动生成的,没办法直接去定义 hostAlias。这里我们就可以通过上面的`--pre-clone-script`参数来指定一段脚本来添加 hosts 信息,也就是在上面的 ConfigMap 中添加环境变量`RUNNER_PRE_CLONE_SCRIPT`的值即可:
+
+```yaml
+RUNNER_PRE_CLONE_SCRIPT = "echo 'xx.xx.xxx.xx git.qikqiak.com' >> /etc/hosts"
+```
+
 除了上面的一些环境变量相关的配置外,还需要一个用于注册、运行和取消注册 Gitlab CI Runner 的小脚本。只有当 Pod 正常通过 Kubernetes(TERM信号)终止时,才会触发转轮取消注册。 如果强制终止 Pod(SIGKILL信号),Runner 将不会注销自身。必须手动完成对这种**被杀死的** Runner 的清理,配置清单文件如下:(runner-scripts-cm.yaml)
 ```yaml
diff --git a/docs/66.devops.md b/docs/66.devops.md
index 9bd846d..104aed2 100644
--- a/docs/66.devops.md
+++ b/docs/66.devops.md
@@ -325,7 +325,7 @@ slave-6e898009-62a2-4798-948f-9c80c3de419b-0jwml-6t6hb 5/5 Running 0
 第三个阶段:构建 Docker 镜像,要构建 Docker 镜像,就需要提供镜像的名称和 tag,要推送到 Harbor 仓库,就需要提供登录的用户名和密码,所以我们这里使用到了`withCredentials`方法,在里面可以提供一个`credentialsId`为`dockerhub`的认证信息,如下:
 ```groovy
-container('构建 Docker 镜像') {
+stage('构建 Docker 镜像') {
     withCredentials([[$class: 'UsernamePasswordMultiBinding',
         credentialsId: 'dockerhub',
         usernameVariable: 'DOCKER_HUB_USER',
@@ -692,10 +692,10 @@ def helmDeploy(Map args) {
     if (args.dry_run) {
         println "Debug 应用"
-        sh "helm upgrade --dry-run --debug --install ${args.name} ${args.chartDir} --set persistence.persistentVolumeClaim.database.storageClass=database --set database.type=internal --set database.internal.database=polling --set database.internal.username=polling --set database.internal.password=polling321 --set api.image.repository=${args.image} --set api.image.tag=${args.tag} --set imagePullSecrets[0].name=myreg --namespace=${args.namespace}"
+        sh "helm upgrade --dry-run --debug --install ${args.name} ${args.chartDir} --set persistence.persistentVolumeClaim.database.storageClass=database --set api.image.repository=${args.image} --set api.image.tag=${args.tag} --set imagePullSecrets[0].name=myreg --namespace=${args.namespace}"
     } else {
         println "部署应用"
-        sh "helm upgrade --install ${args.name} ${args.chartDir} --set persistence.persistentVolumeClaim.database.storageClass=database --set database.type=internal --set database.internal.database=polling --set database.internal.username=polling --set database.internal.password=polling321 --set api.image.repository=${args.image} --set api.image.tag=${args.tag} --set imagePullSecrets[0].name=myreg --namespace=${args.namespace}"
+        sh "helm upgrade --install ${args.name} ${args.chartDir} --set persistence.persistentVolumeClaim.database.storageClass=database --set api.image.repository=${args.image} --set api.image.tag=${args.tag} --set imagePullSecrets[0].name=myreg --namespace=${args.namespace}"
         echo "应用 ${args.name} 部署成功. 可以使用 helm status ${args.name} 查看应用状态"
     }
 }
@@ -733,7 +733,7 @@ podTemplate(label: label, containers: [
             throw(exc)
         }
     }
-    container('构建 Docker 镜像') {
+    stage('构建 Docker 镜像') {
         withCredentials([[$class: 'UsernamePasswordMultiBinding',
             credentialsId: 'dockerhub',
             usernameVariable: 'DOCKER_HUB_USER',
diff --git "a/docs/67.Upgrade\351\233\206\347\276\244.md" "b/docs/67.Upgrade\351\233\206\347\276\244.md"
new file mode 100644
index 0000000..81a3a01
--- /dev/null
+++ "b/docs/67.Upgrade\351\233\206\347\276\244.md"
@@ -0,0 +1,277 @@
+# 67. 集群升级
+
+由于课程中的集群版本是 v1.10.0,这个版本相对有点旧了,最新版本都已经 v1.14.x 了,为了尽量保证课程内容的更新度,所以我们需要将集群版本更新。我们的集群是使用 kubeadm 搭建的,我们知道使用 kubeadm 搭建的集群更新起来是非常方便的,但是由于我们这里版本跨度太大,不能直接从 1.10.x 更新到 1.14.x,kubeadm 的更新是不支持跨多个主版本的,所以我们现在是 1.10,只能先更新到 1.11 版本,然后再从 1.11 更新到 1.12......
不过版本更新的方式方法基本上都是一样的,所以后面要更新的话也挺简单了,下面我们就先将集群更新到 v1.11.0 版本。 + + +## 更新集群 +首先我们保留 kubeadm config 文件: +```shell +$ kubeadm config view +api: + advertiseAddress: 10.151.30.11 + bindPort: 6443 + controlPlaneEndpoint: "" +auditPolicy: + logDir: /var/log/kubernetes/audit + logMaxAge: 2 + path: "" +authorizationModes: +- Node +- RBAC +certificatesDir: /etc/kubernetes/pki +cloudProvider: "" +criSocket: /var/run/dockershim.sock +etcd: + caFile: "" + certFile: "" + dataDir: /var/lib/etcd + endpoints: null + image: "" + keyFile: "" +imageRepository: k8s.gcr.io +kubeProxy: + config: + bindAddress: 0.0.0.0 + clientConnection: + acceptContentTypes: "" + burst: 10 + contentType: application/vnd.kubernetes.protobuf + kubeconfig: /var/lib/kube-proxy/kubeconfig.conf + qps: 5 + clusterCIDR: 10.244.0.0/16 + configSyncPeriod: 15m0s + conntrack: + max: null + maxPerCore: 32768 + min: 131072 + tcpCloseWaitTimeout: 1h0m0s + tcpEstablishedTimeout: 24h0m0s + enableProfiling: false + healthzBindAddress: 0.0.0.0:10256 + hostnameOverride: "" + iptables: + masqueradeAll: false + masqueradeBit: 14 + minSyncPeriod: 0s + syncPeriod: 30s + ipvs: + minSyncPeriod: 0s + scheduler: "" + syncPeriod: 30s + metricsBindAddress: 127.0.0.1:10249 + mode: "" + nodePortAddresses: null + oomScoreAdj: -999 + portRange: "" + resourceContainer: /kube-proxy + udpIdleTimeout: 250ms +kubeletConfiguration: {} +kubernetesVersion: v1.10.0 +networking: + dnsDomain: cluster.local + podSubnet: 10.244.0.0/16 + serviceSubnet: 10.96.0.0/12 +nodeName: ydzs-master +privilegedPods: false +token: "" +tokenGroups: +- system:bootstrappers:kubeadm:default-node-token +tokenTTL: 24h0m0s +tokenUsages: +- signing +- authentication +unifiedControlPlaneImage: "" +``` + +将上面的imageRepository值更改为:gcr.azk8s.cn/google_containers,然后保存内容到文件 kubeadm-config.yaml 中(当然如果你的集群可以获取到 grc.io 的镜像可以不用更改)。 + +然后更新 kubeadm: +```shell +$ yum makecache fast && yum install -y kubeadm-1.11.0-0 kubectl-1.11.0-0 +$ kubeadm version +kubeadm version: &version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.0", GitCommit:"91e7b4fd31fcd3d5f436da26c980becec37ceefe", GitTreeState:"clean", BuildDate:"2018-06-27T20:14:41Z", GoVersion:"go1.10.2", Compiler:"gc", Platform:"linux/amd64"} +``` + +> 因为 kubeadm upgrade plan 命令执行过程中会去 dl.k8s.io 获取版本信息,这个地址是需要科学方法才能访问的,所以我们可以先将 kubeadm 更新到目标版本,然后就可以查看到目标版本升级的一些信息了。 + +执行 upgrade plan 命令查看是否可以升级: +```shell +$ kubeadm upgrade plan +[preflight] Running pre-flight checks. +[upgrade] Making sure the cluster is healthy: +[upgrade/config] Making sure the configuration is correct: +[upgrade/config] Reading configuration from the cluster... 
+[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml' +I0518 18:50:12.844665 9676 feature_gate.go:230] feature gates: &{map[]} +[upgrade] Fetching available versions to upgrade to +[upgrade/versions] Cluster version: v1.10.0 +[upgrade/versions] kubeadm version: v1.11.0 +[upgrade/versions] WARNING: Couldn't fetch latest stable version from the internet: unable to get URL "https://dl.k8s.io/release/stable.txt": Get https://dl.k8s.io/release/stable.txt: dial tcp 35.201.71.162:443: i/o timeout +[upgrade/versions] WARNING: Falling back to current kubeadm version as latest stable version +[upgrade/versions] WARNING: Couldn't fetch latest version in the v1.10 series from the internet: unable to get URL "https://dl.k8s.io/release/stable-1.10.txt": Get https://dl.k8s.io/release/stable-1.10.txt: dial tcp 35.201.71.162:443: i/o timeout + +Components that must be upgraded manually after you have upgraded the control plane with 'kubeadm upgrade apply': +COMPONENT CURRENT AVAILABLE +Kubelet 3 x v1.10.0 v1.11.0 + +Upgrade to the latest stable version: + +COMPONENT CURRENT AVAILABLE +API Server v1.10.0 v1.11.0 +Controller Manager v1.10.0 v1.11.0 +Scheduler v1.10.0 v1.11.0 +Kube Proxy v1.10.0 v1.11.0 +CoreDNS 1.1.3 +Kube DNS 1.14.8 +Etcd 3.1.12 3.2.18 + +You can now apply the upgrade by executing the following command: + + kubeadm upgrade apply v1.11.0 + +_____________________________________________________________________ + +``` + + +我们可以先使用 dry-run 命令查看升级信息: +```shell +$ kubeadm upgrade apply v1.11.0 --config kubeadm-config.yaml --dry-run +``` + +> 注意要通过`--config`指定上面保存的配置文件,该配置文件信息包含了上一个版本的集群信息以及修改搞得镜像地址。 + +查看了上面的升级信息确认无误后就可以执行升级操作了: +```shell +$ kubeadm upgrade apply v1.11.0 --config kubeadm-config.yaml +kubeadm upgrade apply v1.11.0 --config kubeadm-config.yaml +[preflight] Running pre-flight checks. +I0518 18:57:29.134722 12284 feature_gate.go:230] feature gates: &{map[]} +[upgrade] Making sure the cluster is healthy: +[upgrade/config] Making sure the configuration is correct: +[upgrade/config] Reading configuration options from a file: kubeadm-config.yaml +I0518 18:57:29.179231 12284 feature_gate.go:230] feature gates: &{map[]} +[upgrade/apply] Respecting the --cri-socket flag that is set with higher priority than the config file. +[upgrade/version] You have chosen to change the cluster version to "v1.11.0" +[upgrade/versions] Cluster version: v1.10.0 +[upgrade/versions] kubeadm version: v1.11.0 +[upgrade/confirm] Are you sure you want to proceed with the upgrade? [y/N]: y +[upgrade/prepull] Will prepull images for components [kube-apiserver kube-controller-manager kube-scheduler etcd] +[upgrade/apply] Upgrading your Static Pod-hosted control plane to version "v1.11.0"... +Static pod: kube-apiserver-ydzs-master hash: 3abd7df4382a9b60f60819f84de40e11 +Static pod: kube-controller-manager-ydzs-master hash: 1a0f3ccde96238d31012390b61109573 +Static pod: kube-scheduler-ydzs-master hash: 2acb197d598c4730e3f5b159b241a81b + + +``` + +隔一段时间看到如下信息就证明集群升级成功了: +```shell +...... +[bootstraptoken] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token +[bootstraptoken] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster +[addons] Applied essential addon: CoreDNS + + +[addons] Applied essential addon: kube-proxy + +[upgrade/successful] SUCCESS! Your cluster was upgraded to "v1.11.0". Enjoy! 
+ +[upgrade/kubelet] Now that your control plane is upgraded, please proceed with upgrading your kubelets if you haven't already done so. +``` + + +由于上面我们已经更新过 kubectl 了,现在我们用 kubectl 来查看下版本信息: +```shell +$ kubectl version +Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.0", GitCommit:"91e7b4fd31fcd3d5f436da26c980becec37ceefe", GitTreeState:"clean", BuildDate:"2018-06-27T20:17:28Z", GoVersion:"go1.10.2", Compiler:"gc", Platform:"linux/amd64"} +Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.0", GitCommit:"91e7b4fd31fcd3d5f436da26c980becec37ceefe", GitTreeState:"clean", BuildDate:"2018-06-27T20:08:34Z", GoVersion:"go1.10.2", Compiler:"gc", Platform:"linux/amd64"} +``` + +可以看到现在 Server 端和 Client 端都已经是 v1.11.0 版本了,然后查看下 Pod 信息: +```shell +$ kubectl get pods -n kube-system +NAME READY STATUS RESTARTS AGE +authproxy-oauth2-proxy-798cff85fc-pc8x5 1/1 Running 0 34d +cert-manager-796fb45d79-wcrfp 1/1 Running 2 34d +coredns-7f6746b7f-2cs2x 1/1 Running 0 5m +coredns-7f6746b7f-clphf 1/1 Running 0 5m +etcd-ydzs-master 1/1 Running 0 10m +kube-apiserver-ydzs-master 1/1 Running 0 7m +kube-controller-manager-ydzs-master 1/1 Running 0 7m +kube-flannel-ds-amd64-jxzq9 1/1 Running 8 64d +kube-flannel-ds-amd64-r56r9 1/1 Running 3 64d +kube-flannel-ds-amd64-xw9fx 1/1 Running 2 64d +kube-proxy-gqvdg 1/1 Running 0 3m +kube-proxy-sn7xb 1/1 Running 0 3m +kube-proxy-vbrr7 1/1 Running 0 2m +kube-scheduler-ydzs-master 1/1 Running 0 6m +nginx-ingress-controller-587b4c68bf-vsqgm 1/1 Running 2 34d +nginx-ingress-default-backend-64fd9fd685-lmxhw 1/1 Running 1 34d +tiller-deploy-847cfb9744-5cvh8 1/1 Running 0 4d +``` + +## 更新 kubelet + +可以看到我们之前的 kube-dns 服务已经被 coredns 取代了,这是因为在 v1.11.0 版本后就默认使用 coredns 了,我们也可以访问下集群中的服务看是否有影响,然后查看下集群的 Node 信息: +```shell +$ kubectl get nodes +NAME STATUS ROLES AGE VERSION +ydzs-master Ready master 64d v1.10.0 +ydzs-node1 Ready 64d v1.10.0 +ydzs-node2 Ready 64d v1.10.0 +``` + +可以看到版本并没有更新,这是因为节点上的 kubelet 还没有更新的,我们可以通过 kubelet 查看下版本: +```shell +$ kubelet --version +Kubernetes v1.10.0 +``` + +这个时候我们去手动更新下 kubelet: +```shell +$ yum install -y kubelet-1.11.0-0 +# 安装完成后查看下版本 +$ kubelet --version +Kubernetes v1.11.0 +# 然后重启 kubelet 服务 +$ systemctl daemon-reload +$ systemctl restart kubelet +$ kubectl get nodes +NAME STATUS ROLES AGE VERSION +ydzs-master Ready master 64d v1.11.0 +ydzs-node1 Ready 64d v1.10.0 +ydzs-node2 Ready 64d v1.10.0 +``` + +注意事项: + +* 如果节点上 swap 没有关掉重启 kubelet 服务会报错,所以最好是关掉 swap,执行命令:`swapoff -a` 即可。 +* 1.11.0 版本的 kubelet 默认使用的`pod-infra-container-image`镜像名称为:`k8s.gcr.io/pause:3.1`,所以最好先提前查看下集群节点上是否有这个镜像,因为我们之前 1.10.0 版本的集群默认的名字为`k8s.gcr.io/pause-amd64:3.1`,所以如果节点上还是之前的 pause 镜像的话,需要先重新打下镜像 tag: + +```shell +$ docker tag k8s.gcr.io/pause-amd64:3.1 k8s.gcr.io/pause:3.1 +``` + +没有的话可以提前下载到节点上也可以通过配置参数进行指定,在文件`/var/lib/kubelet/kubeadm-flags.env`中添加如下参数信息: + +```shell +KUBELET_KUBEADM_ARGS=--cgroup-driver=cgroupfs --cni-bin-dir=/opt/cni/bin --cni-conf-dir=/etc/cni/net.d --network-plugin=cni --pod-infra-container-image=cnych/pause-amd64:3.1 +``` + + +可以看到我们更新了 kubelet 的节点版本信息已经更新了,同样的方式去把另外两个节点 kubelet 更新即可。 + +> 另外需要注意的是最好在节点上的 kubelet 更新之前将节点设置为不可调度,更新完成后再设置回来,可以避免不必要的错误。 + +最后看下升级后的集群: +```shell +$ kubectl get nodes +NAME STATUS ROLES AGE VERSION +ydzs-master Ready master 64d v1.11.0 +ydzs-node1 Ready 64d v1.11.0 +ydzs-node2 Ready 64d v1.11.0 +``` + +到这里我们的集群就升级成功了,我们可以用同样的方法将集群升级到 v1.12.x、v1.13.x、v1.14.x 版本,而且升级过程中是不会影响到现有业务的。 + -- GitLab
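上面提到在更新节点上的 kubelet 之前,最好先将节点设置为不可调度,更新完成后再恢复调度。下面是一个可供参考的操作示例(以文中的 ydzs-node1 节点为例,使用 kubectl 自带的 drain/uncordon 命令,仅作示意,实际驱逐时可根据节点上的 Pod 情况追加其他参数):

```shell
# 先将节点设置为不可调度,并驱逐节点上的 Pod(DaemonSet 管理的 Pod 除外)
$ kubectl drain ydzs-node1 --ignore-daemonsets
# 在该节点上更新并重启 kubelet
$ yum install -y kubelet-1.11.0-0
$ systemctl daemon-reload
$ systemctl restart kubelet
# 确认该节点版本已经变为 v1.11.0 后,恢复节点调度
$ kubectl uncordon ydzs-node1
```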
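文末提到可以用同样的方法继续升级到 v1.12.x、v1.13.x、v1.14.x,下面是一个大致的操作思路示例(版本号仅为示意,具体可用的小版本以 yum 源中实际提供的为准;如果集群无法访问 gcr.io,同样需要像前面一样先准备好修改过镜像仓库地址的配置):

```shell
# 以升级到 v1.12.x 为例,先在 master 节点上更新 kubeadm 和 kubectl
$ yum install -y kubeadm-1.12.0-0 kubectl-1.12.0-0
# 查看升级计划,确认无误后执行升级
$ kubeadm upgrade plan
$ kubeadm upgrade apply v1.12.0
# 升级完成后再到每个节点上依次更新 kubelet 并重启
$ yum install -y kubelet-1.12.0-0
$ systemctl daemon-reload
$ systemctl restart kubelet
```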