k8s has become the industry-standard platform for container orchestration. This article shows how to deploy a k8s cluster on a single machine that serves as both master and worker node, using the flannel network plugin for the underlying network model, and finishes by running a simple nginx service. A single-node k8s cluster is a quick way to test, debug, and learn k8s.
The k8s version installed in this article is 1.20 (the image list printed locally below shows v1.20.0, while the cluster itself is initialized with v1.20.2).
Deploying a single-node k8s cluster involves the following steps:
- Check and configure the environment to meet the k8s requirements.
- Install a container runtime for k8s; this article uses Docker.
- Install the k8s cluster management tools kubeadm/kubelet/kubectl.
- Initialize the k8s cluster with kubeadm.
- Deploy the flannel network plugin with kubectl.
The installation itself is not particularly complex, but because of the Great Firewall, pulling some of the required container images from inside China can be difficult; this mainly affects steps 4 and 5 (cluster initialization and the network plugin). The list of images used during this installation is given at the end of the article for reference. I usually try to pull the images from a domestic mirror such as Alibaba Cloud first, and fall back to a proxy to reach the upstream registries if that fails.
Without image-pull problems, the whole deployment should take less than an hour. Common problems encountered during installation, along with their solutions, are listed at the end of the article.
1. Check the current environment and initialize it as required
1.1 Installation requirements
The basic requirements for installing a k8s cluster are:
- At least 2 CPU cores and 2 GB of RAM
- One of the following operating system versions:
- Ubuntu 16.04+
- Debian 9+
- CentOS 7+
- Red Hat Enterprise Linux (RHEL) 7+
- Fedora 25+
- HypriotOS v1.0.1+
- Flatcar Container Linux (使用 2512.3.0 版本测试通过)
- Full network connectivity between all machines in the cluster
- No duplicate hostnames, MAC addresses, or product_uuids among the nodes
- The ports required by k8s must not be blocked by a firewall, and must not already be in use on the host
- Swap disabled
For the detailed installation requirements, see the official kubeadm installation documentation.
The host used in this article has the following configuration:
* 2 CPU cores + 4 GB RAM
* Ubuntu 20.10, Linux 5.8.0-41-generic
* Single-NIC networking: one Ethernet adapter attached to the LAN, firewall disabled
1.2 Check CPU and memory
# Check CPU
$ cat /proc/cpuinfo
# Check memory
$ cat /proc/meminfo
1.3 Check the OS version
$ uname -a
Linux k8s-master01 5.8.0-41-generic #46-Ubuntu SMP Mon Jan 18 16:48:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.10
DISTRIB_CODENAME=groovy
DISTRIB_DESCRIPTION="Ubuntu 20.10"
1.4 Check the MAC address and product_uuid
Check the host's MAC address and product_uuid; in a production environment these must be unique on every node.
# List the host's network adapters and MAC addresses
$ ifconfig -a
enp0s3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.0.2.15 netmask 255.255.255.0 broadcast 10.0.2.255
inet6 fe80::7f65:6807:9e8e:8ac3 prefixlen 64 scopeid 0x20<link>
ether 08:00:27:52:8d:82 txqueuelen 1000 (Ethernet)
RX packets 84 bytes 35282 (35.2 KB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 117 bytes 17453 (17.4 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 150 bytes 12646 (12.6 KB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 150 bytes 12646 (12.6 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
# Check the MAC address of enp0s3
$ cat /sys/class/net/enp0s3/address
08:00:27:52:8d:82
# Check the host's product_uuid
$ sudo cat /sys/class/dmi/id/product_uuid
975ca0e1-d319-3745-9915-a61ef4297ddf
1.5 Check network adapters and routes
Check the routing table. This host has a single network adapter, and the default route 0.0.0.0 points at the gateway as expected.
$ route -v
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
default _gateway 0.0.0.0 UG 100 0 0 enp0s3
10.0.2.0 0.0.0.0 255.255.255.0 U 100 0 0 enp0s3
link-local 0.0.0.0 255.255.0.0 U 1000 0 0 enp0s3
If the host has more than one network adapter, make sure the k8s components can reach the intended target adapter through the default route, as checked below.
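A quick way to check (a sketch; 8.8.8.8 is just an arbitrary outside address) is to ask the kernel which interface and source address it would pick:
# Show the interface and source IP used for an outside destination;
# kubeadm advertises this source address by default
$ ip route get 8.8.8.8
# If the wrong adapter is chosen, pin the correct host IP explicitly with
# kubeadm init --apiserver-advertise-address=<correct-host-ip>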
1.6 Check the netfilter module
netfilter is a Linux kernel framework for packet filtering and network address translation. iptables is a userspace program that manages netfilter rules: via netfilter hooks it maps packets onto rule chains to control traffic.
Because container IPs in a k8s cluster are assigned dynamically, k8s uses iptables/netfilter to steer service traffic and implement dynamic load balancing within the cluster.
The following commands load the netfilter bridge module and allow iptables to filter bridged traffic.
# Load br_netfilter
$ sudo modprobe br_netfilter
# Confirm that the module is loaded; output like the following means it is
$ lsmod | grep br_netfilter
br_netfilter 28672 0
bridge 200704 1 br_netfilter
# Allow iptables to filter bridged traffic
$ cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF
$ cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
$ sudo sysctl --system
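To verify that the settings took effect (an optional sanity check):
# Both values should print as 1
$ sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables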
1.7 Disable swap
k8s does not support Linux swap, for reasons of cluster performance and stability. This may change in the future: swap support is expected to arrive around version 1.22. For the full discussion see: Kubelet/Kubernetes should work with Swap Enabled
Commands to disable swap:
# Check the swap allocation in memory
$ free -m
total used free shared buff/cache available
Mem: 3932 854 457 15 2620 2783
Swap: 2047 0 2047
# Disable swap temporarily
$ sudo swapoff -a
# Confirm the swap allocation is now 0
$ free -m
total used free shared buff/cache available
Mem: 3932 1265 1074 12 1592 2499
Swap: 0 0 0
# Discourage swapping (run the following as root). Note: vm.swappiness=0 only
# tells the kernel to avoid swapping; the truly permanent fix is to comment out
# the swap entries in /etc/fstab (see the sketch below).
$ echo "vm.swappiness=0" >> /etc/sysctl.conf
$ sysctl -p /etc/sysctl.conf
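A minimal sketch of the permanent fix, assuming the swap devices are declared in /etc/fstab:
# Comment out every swap entry in /etc/fstab so it is not re-enabled on boot
$ sudo sed -ri 's/^([^#].*\sswap\s.*)$/# \1/' /etc/fstab
# Verify: this should print no active swap devices
$ swapon --show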
1.8 Check required ports and host configuration
k8s needs different ports on master and worker nodes.
Master node (control plane):
- TCP 6443: Kubernetes API server
- TCP 2379-2380: etcd server client API
- TCP 10250: Kubelet API
- TCP 10251: kube-scheduler
- TCP 10252: kube-controller-manager
Worker node:
- TCP 10250: Kubelet API
- TCP 30000-32767: NodePort services
Before proceeding, confirm that none of these ports are occupied or blocked by a firewall (see the check below).
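A quick occupancy check (a sketch using ss; any hit should be investigated before continuing):
# List listeners on the control-plane ports; no output means they are free
$ sudo ss -tlnp | grep -E ':(6443|2379|2380|1025[0-2])\s'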
# For a quick demo deployment, it is simplest to turn the firewall off (if one is running).
# On CentOS/RHEL:
$ systemctl stop firewalld && systemctl disable firewalld
# On Ubuntu (as used in this article):
$ sudo ufw disable
1.9 Miscellaneous
For convenience during installation and the demo, it helps to keep three shell windows open:
- First window: switched to the root account, for commands that need root privileges, including tool installation and the kubeadm commands that start and manage the k8s cluster.
- Second window: tailing the system log with tail -f /var/log/syslog, where k8s cluster startup and runtime messages can be watched.
- Third window: running kubectl commands to talk to the k8s control plane and run container workloads in the cluster.
2. Install Docker as the container runtime for k8s
k8s supports several container runtimes, most commonly:
- containerd
- CRI-O
- Docker
This article uses Docker as the container runtime for k8s. The installation steps below follow the official documentation, with slight adjustments.
# Install curl (skip if already installed)
$ sudo apt install curl
# Install packages to allow apt to use a repository over HTTPS
$ sudo apt-get update && sudo apt-get install -y \
apt-transport-https ca-certificates curl software-properties-common gnupg2
# Add the Docker apt repository
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key --keyring /etc/apt/trusted.gpg.d/docker.gpg add -
$ sudo add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) \
stable"
# Install Docker
$ sudo apt-get update
$ sudo apt-get install docker-ce docker-ce-cli
# After installation, the Docker runtime socket is available at the path below, which is kubeadm's default search location
# https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/
# kubeadm automatically tries to detect an installed container runtime by scanning through a list of well known Unix domain sockets.
$ ll /var/run/docker.sock
# Configure the Docker daemon: set the cgroup driver to systemd
$ sudo mkdir -p /etc/docker
$ cat <<EOF | sudo tee /etc/docker/daemon.json
{
"exec-opts": ["native.cgroupdriver=systemd"],
"log-driver": "json-file",
"log-opts": {
"max-size": "100m"
},
"storage-driver": "overlay2"
}
EOF
# Restart the Docker daemon
$ sudo systemctl daemon-reload
$ sudo systemctl restart docker
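To confirm that Docker now uses the systemd cgroup driver (an optional check):
# Should print "Cgroup Driver: systemd"
$ docker info | grep -i 'cgroup driver'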
3. Install the k8s management tools kubelet/kubeadm/kubectl
k8s provides three tools for starting and managing a cluster:
- kubeadm: a cluster bootstrap tool; kubeadm init initializes a master node and kubeadm join adds worker nodes.
- kubelet: the node agent; it takes instructions from the k8s control plane and starts and manages the pods and containers on the current node.
- kubectl: the k8s command-line client, used to talk to the control plane, issue commands, and inspect cluster state.
All three are required to build the cluster. Install them as follows:
# Add a domestic apt source for k8s (run the following as root)
$ curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -
$ cat <<EOF | sudo tee /etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF
$ apt-get update
# Install kubelet/kubeadm/kubectl
$ apt-get install -y kubelet kubeadm kubectl
# Start kubelet
$ sudo systemctl daemon-reload
$ sudo systemctl restart kubelet
# Watch the kubelet log
$ tail -f /var/log/syslog
Note: if the kubelet service failure messages below appear in /var/log/syslog, there is no need to worry yet; before the k8s master node is initialized, these messages are normal. kubelet keeps retrying the connection to the k8s API server until it succeeds.
kubelet[18544]: #011/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/go.opencensus.io/stats/view/worker.go:32 +0x57
systemd[1]: kubelet.service: Main process exited, code=exited, status=255/EXCEPTION
systemd[1]: kubelet.service: Failed with result 'exit-code'.
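Optionally, pin the three packages so a routine apt upgrade cannot move the cluster to a new version unexpectedly. A sketch, assuming version 1.20.2-00 is what the repository offers (check with apt-cache madison kubeadm):
# Install a specific version (the exact version string is an assumption)
$ apt-get install -y kubelet=1.20.2-00 kubeadm=1.20.2-00 kubectl=1.20.2-00
# Prevent automatic upgrades of the three packages
$ apt-mark hold kubelet kubeadm kubectl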
4. Start the k8s cluster
Use kubeadm to bring up the k8s cluster:
kubeadm init <args>
Before running it, two preparations are recommended:
- Run kubeadm config images list to see the required images. Because gcr.io is blocked in China, pull these images from a domestic mirror such as Alibaba Cloud first, then re-tag them into the local k8s.gcr.io namespace with docker tag. Details in 4.1.
- Plan the k8s service and pod CIDRs so they do not clash with the host network. This article uses the following three ranges:
- Host network: 10.0.2.0/24
- k8s service CIDR: 10.1.0.0/16
- k8s pod CIDR: 10.244.0.0/16
4.1 Fetch the images needed by the cluster
The following script pulls the images needed to start the cluster:
#!/usr/bin/env bash
# kube-adm-images.sh
# Get this image list and these versions from `kubeadm config images list`
images=(
kube-apiserver:v1.20.2
kube-controller-manager:v1.20.2
kube-scheduler:v1.20.2
kube-proxy:v1.20.2
pause:3.2
etcd:3.4.13-0
coredns:1.7.0
)
for imageName in ${images[@]} ; do
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/$imageName
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/$imageName k8s.gcr.io/$imageName
done
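Run it as follows (as root, or any user who can reach the Docker daemon):
$ chmod +x kube-adm-images.sh
$ ./kube-adm-images.sh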
The image lists below show what is required and what was pulled locally.
# Fetch the list of images required by the k8s control plane
$ kubeadm config images list
19385 version.go:101] could not fetch a Kubernetes version from the internet: unable to get URL "https://dl.k8s.io/release/stable-1.txt": Get "https://storage.googleapis.com/kubernetes-release/release/stable-1.txt": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
19385 version.go:102] falling back to the local client version: v1.20.0
k8s.gcr.io/kube-apiserver:v1.20.0
k8s.gcr.io/kube-controller-manager:v1.20.0
k8s.gcr.io/kube-scheduler:v1.20.0
k8s.gcr.io/kube-proxy:v1.20.0
k8s.gcr.io/pause:3.2
k8s.gcr.io/etcd:3.4.13-0
k8s.gcr.io/coredns:1.7.0
# List the local images
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
k8s.gcr.io/kube-proxy v1.20.0 10cc881966cf 8 weeks ago 118MB
k8s.gcr.io/kube-controller-manager v1.20.0 b9fa1895dcaa 8 weeks ago 116MB
k8s.gcr.io/kube-scheduler v1.20.0 3138b6e3d471 8 weeks ago 46.4MB
k8s.gcr.io/kube-apiserver v1.20.0 ca9843d3b545 8 weeks ago 122MB
k8s.gcr.io/etcd 3.4.13-0 0369cf4303ff 5 months ago 253MB
k8s.gcr.io/coredns 1.7.0 bfe3a36ebd25 7 months ago 45.2MB
k8s.gcr.io/pause 3.2 80d28bedfe5d 11 months ago 683kB
4.2 Run kubeadm init
Next, run kubeadm init to start the k8s cluster; note the service-cidr and pod-network-cidr parameters in the command.
# Host network: 10.0.2.0/24
# k8s service CIDR: 10.1.0.0/16
# k8s pod CIDR: 10.244.0.0/16
# Start the k8s cluster
$ sudo kubeadm init \
--image-repository registry.aliyuncs.com/google_containers \
--kubernetes-version v1.20.2 \
--service-cidr=10.1.0.0/16 \
--pod-network-cidr=10.244.0.0/16
# Full log of the run:
$ sudo kubeadm init \
> --image-repository registry.aliyuncs.com/google_containers \
> --kubernetes-version v1.20.2 \
> --service-cidr=10.1.0.0/16 \
> --pod-network-cidr=10.244.0.0/16
[init] Using Kubernetes version: v1.20.2
[preflight] Running pre-flight checks
[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.3. Latest validated version: 19.03
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local michaelk8s-virtualbox] and IPs [10.1.0.1 10.0.2.15]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost michaelk8s-virtualbox] and IPs [10.0.2.15 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost michaelk8s-virtualbox] and IPs [10.0.2.15 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 15.002558 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.20" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node michaelk8s-virtualbox as control-plane by adding the labels "node-role.kubernetes.io/master=''" and "node-role.kubernetes.io/control-plane='' (deprecated)"
[mark-control-plane] Marking the node michaelk8s-virtualbox as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: xxx.xxxx
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 10.0.2.15:6443 --token xxx.xxx --discovery-token-ca-cert-hash sha256:xxxx
Once the cluster is up, list its pods with kubectl get pods -A:
# Set up the kubectl config file
$ mkdir -p $HOME/.kube
$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Check the cluster status
$ kubectl cluster-info
Kubernetes control plane is running at https://10.0.2.15:6443
KubeDNS is running at https://10.0.2.15:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
# The control-plane services are:
# 1. kube-apiserver
# 2. kube-controller-manager
# 3. kube-scheduler
# 4. kube-proxy
# 5. etcd
# 6. coredns
# List the control-plane pods
$ kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-7f89b7bc75-685w4 0/1 Pending 0 3m47s
kube-system coredns-7f89b7bc75-tcp2b 0/1 Pending 0 3m47s
kube-system etcd-michaelk8s-virtualbox 1/1 Running 0 3m53s
kube-system kube-apiserver-michaelk8s-virtualbox 1/1 Running 0 3m53s
kube-system kube-controller-manager-michaelk8s-virtualbox 1/1 Running 0 3m53s
kube-system kube-proxy-6pvcb 1/1 Running 0 3m47s
kube-system kube-scheduler-michaelk8s-virtualbox 1/1 Running 0 3m53s
# Print the default init/join configuration; in a production environment, master/worker nodes can be initialized from a kubeadm config file
$ sudo kubeadm config print init-defaults
$ sudo kubeadm config print join-defaults
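As an alternative to the long command line above, the same settings can be put into a config file. A sketch using the kubeadm v1beta2 API (the file name is arbitrary):
$ cat <<EOF > kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
imageRepository: registry.aliyuncs.com/google_containers
kubernetesVersion: v1.20.2
networking:
  serviceSubnet: 10.1.0.0/16
  podSubnet: 10.244.0.0/16
EOF
$ sudo kubeadm init --config kubeadm-config.yaml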
Checking the kubelet status now shows that it has started properly.
# Check the kubelet status
$ systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running)
Docs: https://kubernetes.io/docs/home/
Main PID: 97961 (kubelet)
Tasks: 17 (limit: 4650)
Memory: 68.7M
CGroup: /system.slice/kubelet.service
└─97961 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubel>
However, /var/log/syslog still shows the following "cni config uninitialized" errors:
$ tail -f /var/log/syslog
kubelet[22127]: 22127 cni.go:239] Unable to update cni config: no networks found in /etc/cni/net.d
kubelet[22127]: 22127 kubelet.go:2163] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
They say that no CNI network plugin is configured yet, which is exactly what the next step installs.
5. Install the Flannel network plugin
Flannel is a k8s CNI plugin that builds a layer-3 overlay network using VXLAN or UDP encapsulation. Installation is simple: just run kubectl apply. But because of the firewall, it is best to pull the image below in advance.
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
quay.io/coreos/flannel v0.13.0 e708f4bb69e3 3 months ago 57.2MB
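If quay.io is reachable, the image can be pulled directly in advance; otherwise pull from whatever mirror you can reach and re-tag it:
# Pre-pull the flannel image (re-tag it with docker tag if pulled from a mirror)
$ docker pull quay.io/coreos/flannel:v0.13.0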
Then run the installation command:
# Download the kube-flannel.yml file and deploy the flannel network plugin
# File: https://gitee.com/pphh/blog/blob/master/210215_k8s_deployment/kube-flannel.yml
$ kubectl apply -f kube-flannel.yml
# After installation, check for the CNI config file below; if it exists, the installation is complete
$ cat /etc/cni/net.d/10-flannel.conflist
{
"name": "cbr0",
"cniVersion": "0.3.1",
"plugins": [
{
"type": "flannel",
"delegate": {
"hairpinMode": true,
"isDefaultGateway": true
}
},
{
"type": "portmap",
"capabilities": {
"portMappings": true
}
}
]
}
# Check the flannel pod
$ kubectl get pods -A | grep flannel
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system kube-flannel-ds-2ntmg 1/1 Running 0 54m
# Check the logs
$ tail -f /var/log/syslog
Joining mDNS multicast group on interface flannel.1.IPv6 with address fe80::801e:edff:fece:975c.
New relevant interface flannel.1.IPv6 for mDNS.
Registering new address record for fe80::801e:edff:fece:975c on flannel.1.*.
6. Start an nginx service
6.1 Allow the master node to run pods
By default, for security reasons, pods are not scheduled onto the master node. Since this example deploys a runnable single-node k8s cluster, remove that restriction:
$ kubectl taint nodes --all node-role.kubernetes.io/master-
node/michaelk8s-virtualbox untainted
6.2 Start an nginx demo
# Download the k8s-deployment-nginx.yml file and create the nginx deployment
# File: https://gitee.com/pphh/blog/blob/master/210215_k8s_deployment/k8s-deployment-nginx.yml
$ kubectl apply -f k8s-deployment-nginx.yml
# Download the k8s-deployment-nginx-svc.yml file and create the nginx service
# File: https://gitee.com/pphh/blog/blob/master/210215_k8s_deployment/k8s-deployment-nginx-svc.yml
$ kubectl apply -f k8s-deployment-nginx-svc.yml
# Check the nginx pod status
$ kubectl get pods
# Check the service
$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.1.0.1 <none> 443/TCP 70m
nginx-service NodePort 10.1.8.186 <none> 8000:32000/TCP 8s
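For reference, a minimal pair of manifests equivalent to the two downloaded files might look like this (a sketch reconstructed from the kubectl get svc output above; the real files may differ in names, labels, and replica count):
# k8s-deployment-nginx.yml (sketch)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
---
# k8s-deployment-nginx-svc.yml (sketch)
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  type: NodePort
  selector:
    app: nginx
  ports:
  - port: 8000        # service port, matches 10.1.8.186:8000 above
    targetPort: 80    # nginx container port
    nodePort: 32000   # matches 8000:32000/TCP above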
Open a browser and visit the service's ClusterIP address (via the NodePort, it is also reachable at http://<node-ip>:32000/):
- http://10.1.8.186:8000/
The "Welcome to nginx!" page should appear.
7. Problems encountered during installation
7.1 Cannot list docker images
Running docker images reports a permission-denied error:
$ docker images
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get http://%2Fvar%2Frun%2Fdocker.sock/v1.24/images/json: dial unix /var/run/docker.sock: connect: permission denied
Solution: run docker images as root (or use the alternative below).
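Alternatively, add the user to the docker group (a common convenience; note it grants that user root-equivalent access through the Docker socket):
# Add the current user to the docker group, then log out and back in
$ sudo usermod -aG docker $USER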
7.2 Problems pulling docker images
The images used by the k8s cluster installed in this article are:
$ sudo su -
root@michaelk8s-VirtualBox:~# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
nginx latest f6d0b4767a6c 3 weeks ago 133MB
k8s.gcr.io/kube-proxy v1.20.0 10cc881966cf 8 weeks ago 118MB
k8s.gcr.io/kube-scheduler v1.20.0 3138b6e3d471 8 weeks ago 46.4MB
k8s.gcr.io/kube-apiserver v1.20.0 ca9843d3b545 8 weeks ago 122MB
k8s.gcr.io/kube-controller-manager v1.20.0 b9fa1895dcaa 8 weeks ago 116MB
quay.io/coreos/flannel v0.13.0 e708f4bb69e3 3 months ago 57.2MB
k8s.gcr.io/etcd 3.4.13-0 0369cf4303ff 5 months ago 253MB
k8s.gcr.io/coredns 1.7.0 bfe3a36ebd25 7 months ago 45.2MB
k8s.gcr.io/pause 3.2 80d28bedfe5d 11 months ago 683kB
It is best to first try pulling the images from a domestic mirror such as Alibaba Cloud; known source prefixes include:
- registry.aliyuncs.com/google_containers/kube-proxy
- registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy
If the images still cannot be pulled, fall back to a proxy to reach the upstream registries.
7.3 Pod status Init:ErrImagePull
A pod can sometimes stay in the Init:ErrImagePull state. This, too, means an image could not be pulled locally; try switching the image source.
# kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system kube-flannel-ds-2ntmg 0/1 Init:ErrImagePull 0 4m51s
7.4 kubeadm init reports that yaml config files already exist
Running kubeadm init a second time on the same host fails with errors saying the yaml config files already exist:
$ kubeadm init
[init] Using Kubernetes version: v1.20.2
[preflight] Running pre-flight checks
[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.3. Latest validated version: 19.03
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-etcd.yaml]: /etc/kubernetes/manifests/etcd.yaml already exists
[ERROR Swap]: running with swap on is not supported. Please disable swap
[ERROR DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
Solution: run kubeadm reset to reset the local state, then run kubeadm init again.
kubeadm may report other errors when starting the cluster, for example:
[ERROR Swap]: running with swap on is not supported. Please disable swap.
As the message says, disabling swap fixes this.
Alternatively, to skip this check, configure kubeadm and kubelet as follows (see the sketch after this list):
- pass --ignore-preflight-errors=Swap on the kubeadm init command line
- add failSwapOn: false to the kubelet configuration file
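A sketch of both options (the kubelet config path is the kubeadm default, and the append assumes failSwapOn is not already set in that file):
# Option 1: let kubeadm init proceed despite enabled swap
$ sudo kubeadm init --ignore-preflight-errors=Swap
# Option 2: tell kubelet not to fail on swap, then restart it
$ echo "failSwapOn: false" | sudo tee -a /var/lib/kubelet/config.yaml
$ sudo systemctl restart kubelet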
7.5 Syslog reports Unable to update cni config
After installing kubelet and starting it with sudo systemctl restart kubelet, the service does not come up cleanly, and the system log shows:
5644 cni.go:239] Unable to update cni config: no networks found in /etc/cni/net.d
5644 kubelet.go:2163] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Before the Flannel plugin is installed, these errors are expected; they disappear once the plugin is installed.
7.6 kubectl reports that the connection to the server localhost:8080 was refused
The error looks like this:
$ kubectl describe pod
The connection to the server localhost:8080 was refused - did you specify the right host or port?
Solution: this usually means kubectl has not been configured yet. Apply either of the two methods below, then re-run kubectl.
# Method 1: use a config file
$ mkdir -p $HOME/.kube
$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Method 2: use an environment variable
$ export KUBECONFIG=/etc/kubernetes/admin.conf
7.7 Other issues
For more k8s installation problems, see the official kubeadm troubleshooting documentation.