Installing a Kubernetes Cluster on CentOS with Kubeadm

Environment Preparation

  • Environment: CentOS 7.3
  • Machine IPs
    • 192.168.2.225 (Master)
    • 192.168.2.224 (Node)
    • 192.168.2.223 (Node)

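Installing Docker with Docker Machine

See "Installing and Using Docker Machine" for details.

  • Install Docker Machine

    wget -qO- https://blog.yumc.pw/attachment/script/shell/docker/machine.sh | bash
  • Configure SSH keys so that the three machines can reach each other over SSH (search online if you are unsure how)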

  • Open the required firewall ports (on every machine; alternatively, disable the firewall entirely)

    firewall-cmd --zone=public --add-port=2376/tcp --permanent
    firewall-cmd --zone=public --add-port=2377/tcp --permanent
    firewall-cmd --zone=public --add-port=7946/tcp --permanent
    firewall-cmd --zone=public --add-port=7946/udp --permanent
    firewall-cmd --zone=public --add-port=4789/udp --permanent
    firewall-cmd --zone=public --add-port=6443/tcp --permanent
    firewall-cmd --zone=public --add-port=10250/tcp --permanent
    firewall-cmd --reload
  • Disable swap (required on all nodes)

    swapoff -a
  • Also edit /etc/fstab and comment out the swap entries so that swap stays off after a reboot

    sed -i '/swap/s/^/#/' /etc/fstab
  • Enable IP forwarding; otherwise containers cannot reach the network

    echo net.ipv4.ip_forward=1 >> /etc/sysctl.conf
    echo net.bridge.bridge-nf-call-iptables=1 >> /etc/sysctl.conf
    echo net.bridge.bridge-nf-call-ip6tables=1 >> /etc/sysctl.conf
    sysctl -p
  • Create the machines

    docker-machine create --driver generic --generic-ip-address=192.168.2.225 2-225
    docker-machine create --driver generic --generic-ip-address=192.168.2.224 2-224
    docker-machine create --driver generic --generic-ip-address=192.168.2.223 2-223
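
The swap and sysctl steps above append to or edit system files in place, so running them a second time leaves duplicate sysctl lines and doubly commented fstab entries. A minimal sketch of repeatable variants follows. The helper names ensure_line and comment_swap are hypothetical and not part of the original setup, and GNU grep and GNU sed (as shipped with CentOS 7) are assumed.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (names are illustrative, not from the original post).
# Assumes GNU grep and GNU sed, as shipped with CentOS 7.

# Append a line to a file only if that exact line is not already present.
ensure_line() {
    local file="$1" line="$2"
    grep -qxF -- "$line" "$file" 2>/dev/null || echo "$line" >> "$file"
}

# Comment out active swap entries; lines that are already commented are skipped.
comment_swap() {
    local file="$1"
    sed -i -E '/^[[:space:]]*#/!{/[[:space:]]swap[[:space:]]/s/^/#/;}' "$file"
}
```

On a node these might be called as `ensure_line /etc/sysctl.conf net.ipv4.ip_forward=1` followed by `sysctl -p`, and `comment_swap /etc/fstab`.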

Installing Kubeadm

  • SSH into 2-225

    docker-machine ssh 2-225
  • Add the Kubernetes yum repository

    cat > /etc/yum.repos.d/kubernetes.repo<<EOF
    [kubernetes]
    name=Kubernetes
    baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
    enabled=1
    gpgcheck=0
    repo_gpgcheck=0
    gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg
           http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
    EOF
  • Install Kubeadm

    yum install -y -q kubeadm
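
The install command above takes whichever kubeadm version the repository currently serves, which may differ from the kubernetesVersion set in the init configuration. One way to keep them aligned is to pin the package versions. The sketch below assumes the repository carries kubelet, kubeadm and kubectl packages for the requested version; the function name k8s_packages is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a versioned package list for yum.
# Accepts a version with or without a leading "v" (v1.12.1 or 1.12.1).
k8s_packages() {
    local version="${1#v}"
    echo "kubelet-${version} kubeadm-${version} kubectl-${version}"
}
```

It could be used as `yum install -y $(k8s_packages v1.12.1)` on every node, so that the master and the workers end up on the same release.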

Pulling Images Manually

  • Pull from the Aliyun mirror

    curl -qs https://raw.githubusercontent.com/502647092/k8s/master/pull_from_aliyun.sh | bash

Building the Kubernetes Cluster with Kubeadm

  • Init configuration for older versions (v1.11.3 and earlier)

    apiVersion: kubeadm.k8s.io/v1alpha2
    kind: MasterConfiguration
    kubernetesVersion: v1.11.3
    # API server endpoint; needed when there are multiple IPs
    controlPlaneEndpoint: '192.168.2.101'
    # Adds extra names to the API server certificate
    #apiServerCertSANs:
    #- '192.168.2.101'
    apiServerExtraArgs:
      runtime-config: api/all=true
    controllerManagerExtraArgs:
      horizontal-pod-autoscaler-sync-period: 10s
      horizontal-pod-autoscaler-use-rest-clients: "true"
      node-monitor-grace-period: 10s
  • Create the init configuration (the latest version at the time of writing)

    cat > kubeadm.yml<<EOF
    apiVersion: kubeadm.k8s.io/v1alpha3
    kind: ClusterConfiguration
    imageRepository: k8s.gcr.io
    kubernetesVersion: v1.12.1
    # API server endpoint; needed when there are multiple IPs
    controlPlaneEndpoint: '192.168.2.101'
    # Adds extra names to the API server certificate
    #apiServerCertSANs:
    #- '192.168.2.101'
    apiServerExtraArgs:
      runtime-config: api/all=true
    controllerManagerExtraArgs:
      horizontal-pod-autoscaler-sync-period: 10s
      horizontal-pod-autoscaler-use-rest-clients: "true"
      node-monitor-grace-period: 10s
    EOF
  • Initialize the master node

    kubeadm init --config kubeadm.yml
[root@2-225 ~]$ kubeadm init --config kubeadm.yml
[init] using Kubernetes version: v1.11.3
[preflight] running pre-flight checks
[WARNING Firewalld]: firewalld is active, please ensure ports [6443 10250] are open or your cluster may not function correctly
I0918 16:14:31.254349 1214 kernel_validator.go:81] Validating kernel version
I0918 16:14:31.254891 1214 kernel_validator.go:96] Validating kernel config
[preflight/images] Pulling images required for setting up a Kubernetes cluster
[preflight/images] This might take a minute or two, depending on the speed of your internet connection
[preflight/images] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[preflight] Activating the kubelet service
[certificates] Generated ca certificate and key.
[certificates] Generated apiserver certificate and key.
[certificates] apiserver serving cert is signed for DNS names [2-225 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.2.225]
[certificates] Generated apiserver-kubelet-client certificate and key.
[certificates] Generated sa key and public key.
[certificates] Generated front-proxy-ca certificate and key.
[certificates] Generated front-proxy-client certificate and key.
[certificates] Generated etcd/ca certificate and key.
[certificates] Generated etcd/server certificate and key.
[certificates] etcd/server serving cert is signed for DNS names [2-225 localhost] and IPs [127.0.0.1 ::1]
[certificates] Generated etcd/peer certificate and key.
[certificates] etcd/peer serving cert is signed for DNS names [2-225 localhost] and IPs [192.168.2.225 127.0.0.1 ::1]
[certificates] Generated etcd/healthcheck-client certificate and key.
[certificates] Generated apiserver-etcd-client certificate and key.
[certificates] valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[controlplane] wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"
[init] waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests"
[init] this might take a minute or longer if the control plane images have to be pulled
[apiclient] All control plane components are healthy after 59.001744 seconds
[uploadconfig] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.11" in namespace kube-system with the configuration for the kubelets in the cluster
[markmaster] Marking the node 2-225 as master by adding the label "node-role.kubernetes.io/master=''"
[markmaster] Marking the node 2-225 as master by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "2-225" as an annotation
[bootstraptoken] using token: 8qebd3.g81fncibeelo542a
[bootstraptoken] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstraptoken] creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of machines by running the following on each node
as root:

kubeadm join 192.168.2.225:6443 --token 8qebd3.g81fncibeelo542a --discovery-token-ca-cert-hash sha256:d15f7dc588ec3b14b076837657c8dc3af0759eb65beac919984cb134a912ab09
  • On each worker node, run the join command printed at the end of the init output
[root@2-223 ~]$ kubeadm join 192.168.2.225:6443 --token 8qebd3.g81fncibeelo542a --discovery-token-ca-cert-hash sha256:d15f7dc588ec3b14b076837657c8dc3af0759eb65beac919984cb134a912ab09
[preflight] running pre-flight checks
[WARNING RequiredIPVSKernelModulesAvailable]: the IPVS proxier will not be used, because the following required kernel modules are not loaded: [ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh] or no builtin kernel ipvs support: map[ip_vs:{} ip_vs_rr:{} ip_vs_wrr:{} ip_vs_sh:{} nf_conntrack_ipv4:{}]
you can solve this problem with following methods:
1. Run 'modprobe -- ' to load missing kernel modules;
2. Provide the missing builtin kernel ipvs support

I0918 16:28:35.843777 6155 kernel_validator.go:81] Validating kernel version
I0918 16:28:35.843964 6155 kernel_validator.go:96] Validating kernel config
[discovery] Trying to connect to API Server "192.168.2.225:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://192.168.2.225:6443"
[discovery] Requesting info from "https://192.168.2.225:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "192.168.2.225:6443"
[discovery] Successfully established connection with API Server "192.168.2.225:6443"
[kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.11" ConfigMap in the kube-system namespace
[kubelet] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[preflight] Activating the kubelet service
[tlsbootstrap] Waiting for the kubelet to perform the TLS Bootstrap...
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "2-223" as an annotation

This node has joined the cluster:
* Certificate signing request was sent to master and a response
was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the master to see this node join the cluster.
  • If you hit the certificate error below, synchronize the clocks of the two servers
[root@k8s-node1 ~]$ kubeadm join 192.168.10.44:6443 --token 4ihi3s.jx91ma431zq8cjmj --discovery-token-ca-cert-hash sha256:63c0fd0455e216bd4128a91ecf1400db4b17fb846ccc39814d43aa5e3be015f3                                                      
[preflight] running pre-flight checks
[discovery] Trying to connect to API Server "192.168.10.44:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://192.168.10.44:6443"
[discovery] Requesting info from "https://192.168.10.44:6443" again to validate TLS against the pinned public key
[discovery] Failed to request cluster info, will try again: [Get https://192.168.10.44:6443/api/v1/namespaces/kube-public/configmaps/cluster-info: x509: certificate has expired or is not yet valid]
  • If you hit the error below, the versions do not match; make sure both servers run the same version
[root@k8s-node1 ~]$ kubeadm join 192.168.10.44:6443 --token 30z69g.k420edr94ms96hmv --discovery-token-ca-cert-hash sha256:ee5dbd2a205a775f8cb26f10ead191cb91648dabec0d742b68a1473757146995
[preflight] running pre-flight checks
[discovery] Trying to connect to API Server "192.168.10.44:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://192.168.10.44:6443"
[discovery] Requesting info from "https://192.168.10.44:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "192.168.10.44:6443"
[discovery] Successfully established connection with API Server "192.168.10.44:6443"
[kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.11" ConfigMap in the kube-system namespace
configmaps "kubelet-config-1.11" is forbidden: User "system:bootstrap:30z69g" cannot get resource "configmaps" in API group "" in the namespace "kube-system"
  • Configure kubectl on the master node

    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config
  • Check the node status

    kubectl get nodes,po,svc --all-namespaces
[root@2-225 k8s]$ kubectl get nodes,po,svc --all-namespaces
NAME         STATUS     ROLES    AGE   VERSION
node/2-225   NotReady   master   59s   v1.12.1
node/2-224   NotReady   <none>   44s   v1.12.1
node/2-223   NotReady   <none>   42s   v1.12.1

NAMESPACE     NAME                           READY   STATUS              RESTARTS   AGE
kube-system   pod/coredns-576cbf47c7-hxwpt   0/1     ContainerCreating   0          49s
kube-system   pod/coredns-576cbf47c7-jrf2l   0/1     ContainerCreating   0          49s
kube-system   pod/kube-proxy-cl2dv           1/1     Running             0          49s
kube-system   pod/kube-proxy-f24jq           1/1     Running             0          44s
kube-system   pod/kube-proxy-vtcnl           1/1     Running             0          42s

NAMESPACE     NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)         AGE
default       service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP         59s
kube-system   service/kube-dns     ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP   55s
  • All nodes show NotReady here; the reason can be seen with kubectl describe node <node name>
[root@2-225 ~]$ kubectl describe node 2-225                                                                                                          
Name: 2-225
Roles: master
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/hostname=2-225
node-role.kubernetes.io/master=
Annotations: kubeadm.alpha.kubernetes.io/cri-socket=/var/run/dockershim.sock
node.alpha.kubernetes.io/ttl=0
volumes.kubernetes.io/controller-managed-attach-detach=true
CreationTimestamp: Tue, 18 Sep 2018 16:15:29 +0800
Taints: node-role.kubernetes.io/master:NoSchedule
Unschedulable: false
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
OutOfDisk False Tue, 18 Sep 2018 16:36:41 +0800 Tue, 18 Sep 2018 16:15:23 +0800 KubeletHasSufficientDisk kubelet has sufficient disk space available
MemoryPressure False Tue, 18 Sep 2018 16:36:41 +0800 Tue, 18 Sep 2018 16:15:23 +0800 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Tue, 18 Sep 2018 16:36:41 +0800 Tue, 18 Sep 2018 16:15:23 +0800 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Tue, 18 Sep 2018 16:36:41 +0800 Tue, 18 Sep 2018 16:15:23 +0800 KubeletHasSufficientPID kubelet has sufficient PID available
Ready False Tue, 18 Sep 2018 16:36:41 +0800 Tue, 18 Sep 2018 16:15:23 +0800 KubeletNotReady runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Addresses:
InternalIP: 192.168.2.225
Hostname: 2-225
Capacity:
cpu: 4
ephemeral-storage: 51175Mi
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 16165236Ki
pods: 110
Allocatable:
cpu: 4
ephemeral-storage: 48294789041
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 16062836Ki
pods: 110
System Info:
Machine ID: af120467c8e449fea8c3583af852ea00
System UUID: 031B021C-040D-0502-1A06-F50700080009
Boot ID: 5edcacee-8596-4328-85ee-398422bc6680
Kernel Version: 3.10.0-862.11.6.el7.x86_64
OS Image: CentOS Linux 7 (Core)
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://18.6.1
Kubelet Version: v1.11.3
Kube-Proxy Version: v1.11.3
Non-terminated Pods: (5 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits
--------- ---- ------------ ---------- --------------- -------------
kube-system etcd-2-225 0 (0%) 0 (0%) 0 (0%) 0 (0%)
kube-system kube-apiserver-2-225 250m (6%) 0 (0%) 0 (0%) 0 (0%)
kube-system kube-controller-manager-2-225 200m (5%) 0 (0%) 0 (0%) 0 (0%)
kube-system kube-proxy-bgtl4 0 (0%) 0 (0%) 0 (0%) 0 (0%)
kube-system kube-scheduler-2-225 100m (2%) 0 (0%) 0 (0%) 0 (0%)
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 550m (13%) 0 (0%)
memory 0 (0%) 0 (0%)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Starting 22m kubelet, 2-225 Starting kubelet.
Normal NodeAllocatableEnforced 22m kubelet, 2-225 Updated Node Allocatable limit across pods
Normal NodeHasSufficientPID 22m (x5 over 22m) kubelet, 2-225 Node 2-225 status is now: NodeHasSufficientPID
Normal NodeHasSufficientDisk 22m (x6 over 22m) kubelet, 2-225 Node 2-225 status is now: NodeHasSufficientDisk
Normal NodeHasSufficientMemory 22m (x6 over 22m) kubelet, 2-225 Node 2-225 status is now: NodeHasSufficientMemory
Normal NodeHasNoDiskPressure 22m (x6 over 22m) kubelet, 2-225 Node 2-225 status is now: NodeHasNoDiskPressure
Normal Starting 20m kube-proxy, 2-225 Starting kube-proxy.
  • Inspect the coredns pod; its Events section shows that the container has not started because the network plugin is not ready

    kubectl describe po $(kubectl get po -n=kube-system | grep coredns | tail -n 1 | awk '{print $1}') -n=kube-system
[root@2-225 ~]$ kubectl describe po $(kubectl get po -n=kube-system | grep coredns | tail -n 1 | awk '{print $1}') -n=kube-system
Name: coredns-576cbf47c7-jrf2l
Namespace: kube-system
Priority: 0
PriorityClassName: <none>
Node: 2-111/192.168.2.111
Start Time: Thu, 11 Oct 2018 15:22:59 +0800
Labels: k8s-app=kube-dns
pod-template-hash=576cbf47c7
Annotations: <none>
Status: Pending
IP:
Controlled By: ReplicaSet/coredns-576cbf47c7
Containers:
coredns:
Container ID:
Image: k8s.gcr.io/coredns:1.2.2
Image ID:
Ports: 53/UDP, 53/TCP, 9153/TCP
Host Ports: 0/UDP, 0/TCP, 0/TCP
Args:
-conf
/etc/coredns/Corefile
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
Environment: <none>
Mounts:
/etc/coredns from config-volume (ro)
/var/run/secrets/kubernetes.io/serviceaccount from coredns-token-blfck (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: coredns
Optional: false
coredns-token-blfck:
Type: Secret (a volume populated by a Secret)
SecretName: coredns-token-blfck
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: CriticalAddonsOnly
node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 4m46s default-scheduler Successfully assigned kube-system/coredns-576cbf47c7-jrf2l to 2-111
Warning NetworkNotReady 3s (x23 over 4m46s) kubelet, 2-111 network is not ready: [runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized]
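
Two of the manual steps above, copying the join command from the init output and spotting nodes that are not Ready, can be scripted. A minimal sketch follows, assuming the init output was saved to a file and that the node listing comes from kubectl get nodes --no-headers. The function names join_command and not_ready_nodes are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (names are illustrative, not from the original post).

# Print the "kubeadm join ..." line found in saved "kubeadm init" output (stdin).
join_command() {
    grep -E '^[[:space:]]*kubeadm join ' | sed -E 's/^[[:space:]]+//' | tail -n 1
}

# Print the names of nodes whose STATUS column is anything other than "Ready".
# Expects "kubectl get nodes --no-headers" output on stdin; blank lines are skipped.
not_ready_nodes() {
    awk 'NF >= 2 && $2 != "Ready" { print $1 }'
}
```

For example, `kubeadm init --config kubeadm.yml | tee init.log` keeps a copy of the output, after which `join_command < init.log` prints the command to run on the workers, and `kubectl get nodes --no-headers | not_ready_nodes` lists the nodes that still need attention.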

Enabling Start on Boot

  • On all nodes, enable kubelet to start at boot

    systemctl enable kubelet.service

Installing Add-ons

Installing the CNI Network Plugin

  • Install the Weave network plugin

    kubectl apply -f https://git.io/weave-kube-1.6
[root@2-225 ~]$ kubectl apply -f https://git.io/weave-kube-1.6
serviceaccount/weave-net created
clusterrole.rbac.authorization.k8s.io/weave-net created
clusterrolebinding.rbac.authorization.k8s.io/weave-net created
role.rbac.authorization.k8s.io/weave-net created
rolebinding.rbac.authorization.k8s.io/weave-net created
daemonset.extensions/weave-net created

Installing the Rook Storage Plugin

  • Install the storage plugin

    kubectl apply -f https://raw.githubusercontent.com/rook/rook/master/cluster/examples/kubernetes/ceph/common.yaml
    kubectl apply -f https://raw.githubusercontent.com/rook/rook/master/cluster/examples/kubernetes/ceph/operator.yaml
    kubectl apply -f https://raw.githubusercontent.com/rook/rook/master/cluster/examples/kubernetes/ceph/cluster.yaml
[root@2-225 k8s]$ kubectl apply -f https://raw.githubusercontent.com/rook/rook/master/cluster/examples/kubernetes/ceph/common.yaml
namespace/rook-ceph created
customresourcedefinition.apiextensions.k8s.io/cephclusters.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephfilesystems.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephnfses.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephobjectstores.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephobjectstoreusers.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephblockpools.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/volumes.rook.io created
clusterrole.rbac.authorization.k8s.io/rook-ceph-cluster-mgmt created
clusterrole.rbac.authorization.k8s.io/rook-ceph-cluster-mgmt-rules created
role.rbac.authorization.k8s.io/rook-ceph-system created
clusterrole.rbac.authorization.k8s.io/rook-ceph-global created
clusterrole.rbac.authorization.k8s.io/rook-ceph-global-rules created
clusterrole.rbac.authorization.k8s.io/rook-ceph-mgr-cluster created
clusterrole.rbac.authorization.k8s.io/rook-ceph-mgr-cluster-rules created
serviceaccount/rook-ceph-system created
rolebinding.rbac.authorization.k8s.io/rook-ceph-system created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-global created
serviceaccount/rook-ceph-osd created
serviceaccount/rook-ceph-mgr created
serviceaccount/rook-ceph-cmd-reporter created
role.rbac.authorization.k8s.io/rook-ceph-osd created
clusterrole.rbac.authorization.k8s.io/rook-ceph-mgr-system created
clusterrole.rbac.authorization.k8s.io/rook-ceph-mgr-system-rules created
role.rbac.authorization.k8s.io/rook-ceph-mgr created
role.rbac.authorization.k8s.io/rook-ceph-cmd-reporter created
rolebinding.rbac.authorization.k8s.io/rook-ceph-cluster-mgmt created
rolebinding.rbac.authorization.k8s.io/rook-ceph-osd created
rolebinding.rbac.authorization.k8s.io/rook-ceph-mgr created
rolebinding.rbac.authorization.k8s.io/rook-ceph-mgr-system created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-mgr-cluster created
rolebinding.rbac.authorization.k8s.io/rook-ceph-cmd-reporter created
podsecuritypolicy.policy/rook-privileged created
clusterrole.rbac.authorization.k8s.io/psp:rook created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-system-psp created
rolebinding.rbac.authorization.k8s.io/rook-ceph-default-psp created
rolebinding.rbac.authorization.k8s.io/rook-ceph-osd-psp created
rolebinding.rbac.authorization.k8s.io/rook-ceph-mgr-psp created
rolebinding.rbac.authorization.k8s.io/rook-ceph-cmd-reporter-psp created
clusterrolebinding.rbac.authorization.k8s.io/rook-csi-cephfs-plugin-sa-psp created
clusterrolebinding.rbac.authorization.k8s.io/rook-csi-cephfs-provisioner-sa-psp created
serviceaccount/rook-csi-cephfs-plugin-sa created
clusterrole.rbac.authorization.k8s.io/cephfs-csi-nodeplugin created
clusterrole.rbac.authorization.k8s.io/cephfs-csi-nodeplugin-rules created
clusterrolebinding.rbac.authorization.k8s.io/cephfs-csi-nodeplugin created
serviceaccount/rook-csi-cephfs-provisioner-sa created
clusterrole.rbac.authorization.k8s.io/cephfs-external-provisioner-runner created
clusterrole.rbac.authorization.k8s.io/cephfs-external-provisioner-runner-rules created
clusterrolebinding.rbac.authorization.k8s.io/cephfs-csi-provisioner-role created
role.rbac.authorization.k8s.io/cephfs-external-provisioner-cfg created
rolebinding.rbac.authorization.k8s.io/cephfs-csi-provisioner-role-cfg created
clusterrolebinding.rbac.authorization.k8s.io/rook-csi-rbd-plugin-sa-psp created
clusterrolebinding.rbac.authorization.k8s.io/rook-csi-rbd-provisioner-sa-psp created
serviceaccount/rook-csi-rbd-plugin-sa created
clusterrole.rbac.authorization.k8s.io/rbd-csi-nodeplugin created
clusterrole.rbac.authorization.k8s.io/rbd-csi-nodeplugin-rules created
clusterrolebinding.rbac.authorization.k8s.io/rbd-csi-nodeplugin created
serviceaccount/rook-csi-rbd-provisioner-sa created
clusterrole.rbac.authorization.k8s.io/rbd-external-provisioner-runner created
clusterrole.rbac.authorization.k8s.io/rbd-external-provisioner-runner-rules created
clusterrolebinding.rbac.authorization.k8s.io/rbd-csi-provisioner-role created
role.rbac.authorization.k8s.io/rbd-external-provisioner-cfg created
rolebinding.rbac.authorization.k8s.io/rbd-csi-provisioner-role-cfg created
[root@2-225 k8s]$ kubectl apply -f https://raw.githubusercontent.com/rook/rook/master/cluster/examples/kubernetes/ceph/operator.yaml
deployment.apps/rook-ceph-operator created
[root@2-225 k8s]$ kubectl apply -f https://raw.githubusercontent.com/rook/rook/master/cluster/examples/kubernetes/ceph/cluster.yaml
cephcluster.ceph.rook.io/rook-ceph created
  • Check the storage plugin status
[root@2-225 k8s]$ kubectl describe pods -n rook-ceph-system                                                                                          
Name: rook-ceph-operator-78d498c68c-47xnp
Namespace: rook-ceph-system
Priority: 0
PriorityClassName: <none>
Node: 2-224/192.168.2.224
Start Time: Wed, 19 Sep 2018 19:10:16 +0800
Labels: app=rook-ceph-operator
pod-template-hash=3480547247
Annotations: <none>
Status: Pending
IP:
Controlled By: ReplicaSet/rook-ceph-operator-78d498c68c
Containers:
rook-ceph-operator:
Container ID:
Image: rook/ceph:master
Image ID:
Port: <none>
Host Port: <none>
Args:
ceph
operator
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Environment:
ROOK_ALLOW_MULTIPLE_FILESYSTEMS: false
ROOK_LOG_LEVEL: INFO
ROOK_MON_HEALTHCHECK_INTERVAL: 45s
ROOK_MON_OUT_TIMEOUT: 300s
ROOK_HOSTPATH_REQUIRES_PRIVILEGED: false
NODE_NAME: (v1:spec.nodeName)
POD_NAME: rook-ceph-operator-78d498c68c-47xnp (v1:metadata.name)
POD_NAMESPACE: rook-ceph-system (v1:metadata.namespace)
Mounts:
/etc/ceph from default-config-dir (rw)
/var/lib/rook from rook-config (rw)
/var/run/secrets/kubernetes.io/serviceaccount from rook-ceph-system-token-s22w5 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
rook-config:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
default-config-dir:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
rook-ceph-system-token-s22w5:
Type: Secret (a volume populated by a Secret)
SecretName: rook-ceph-system-token-s22w5
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 1m default-scheduler Successfully assigned rook-ceph-system/rook-ceph-operator-78d498c68c-47xnp to 2-224
Normal Pulling 1m kubelet, 2-224 pulling image "rook/ceph:master"
  • Create the pool and StorageClass used to provision PVs

    cat > rook-storage.yaml<<EOF
    apiVersion: ceph.rook.io/v1beta1
    kind: Pool
    metadata:
      name: replicapool
      namespace: rook-ceph
    spec:
      replicated:
        size: 3
    ---
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: rook-ceph-block
    provisioner: ceph.rook.io/block
    parameters:
      pool: replicapool
      clusterNamespace: rook-ceph
    EOF
    kubectl apply -f rook-storage.yaml

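Installing the Dashboard

If nothing happens when you click Sign in, note the following statement from the official documentation: to log in, the Dashboard must be reached either over HTTPS or over HTTP on localhost; otherwise clicking the Sign in button has no effect. See ISSUES for details.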

NOTE: Dashboard should not be exposed publicly using kubectl proxy command as it only allows HTTP connection. For domains other than localhost and 127.0.0.1 it will not be possible to sign in. Nothing will happen after clicking Sign in button on login page.

Command to install the Dashboard

    kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml

Creating an Account for Dashboard Access

The second approach is recommended here.

Grant permissions to the Dashboard's built-in account
  • Bind permissions to the kubernetes-dashboard ServiceAccount

    cat > dashboard-admin.yaml<<EOF
    apiVersion: rbac.authorization.k8s.io/v1beta1
    kind: ClusterRoleBinding
    metadata:
      name: kubernetes-dashboard
      labels:
        k8s-app: kubernetes-dashboard
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: cluster-admin
    subjects:
    - kind: ServiceAccount
      name: kubernetes-dashboard
      namespace: kube-system
    EOF
    kubectl apply -f dashboard-admin.yaml
  • How to get this account's token

    kubectl describe secrets $(kubectl get secrets --namespace kube-system | grep dashboard-token | awk '{print $1}') --namespace kube-system | grep token: | awk '{print $2}'
Create a new administrator
  • A somewhat safer approach is to create a new account and grant it permissions

    cat > admin-role.yaml<<EOF
    kind: ClusterRoleBinding
    apiVersion: rbac.authorization.k8s.io/v1beta1
    metadata:
      name: admin
      annotations:
        rbac.authorization.kubernetes.io/autoupdate: "true"
    roleRef:
      kind: ClusterRole
      name: cluster-admin
      apiGroup: rbac.authorization.k8s.io
    subjects:
    - kind: ServiceAccount
      name: admin
      namespace: kube-system
    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: admin
      namespace: kube-system
      labels:
        kubernetes.io/cluster-service: "true"
        addonmanager.kubernetes.io/mode: Reconcile
    EOF
    kubectl apply -f admin-role.yaml
  • How to get this account's token

    kubectl describe secrets $(kubectl get secrets --namespace kube-system | grep admin-token | awk '{print $1}') --namespace kube-system | grep token: | awk '{print $2}'
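
The token pipelines above can be wrapped in a small reusable filter. The sketch below reads kubectl describe secret output on stdin and prints only the value of the field labelled exactly token:. The function name extract_token is hypothetical, and the output layout is assumed to match the "key: value" form that kubectl describe prints.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of the "token:" field from
# "kubectl describe secret" output supplied on stdin.
extract_token() {
    awk '$1 == "token:" { print $2; exit }'
}
```

With the secret name already known, it could be used as `kubectl describe secret <secret name> --namespace kube-system | extract_token`.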

Accessing the Dashboard

Access via proxy
  • The first option exposes the API through kubectl proxy (local access only)

    • Start the proxy

      kubectl proxy &
    • Then open http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/#!/login

通过新建服务 对外暴露端口
  • 通过 NodeIP+NodePort 访问 此方法可以任意访问 但是存在证书问题 忽略即可

    • 修改 kubernetes-dashboard.yaml 文件

      1
      2
      wget -Okubernetes-dashboard.yaml https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml
      vi kubernetes-dashboard.yaml
    • 拉到底 找到 Service 区域 spec 改为 NodePort

      1
      2
      3
      4
      5
      6
      7
      8
      9
      10
      11
      12
      13
      14
      15
      kind: Service
      apiVersion: v1
      metadata:
      labels:
      k8s-app: kubernetes-dashboard
      name: kubernetes-dashboard
      namespace: kube-system
      spec:
      type: NodePort
      ports:
      - port: 8443
      targetPort: 8443
      nodePort: 30443
      selector:
      k8s-app: kubernetes-dashboard
    • Redeploy so the change takes effect

      kubectl apply -f kubernetes-dashboard.yaml
    • Open https://NodeIP:30443. If the browser reports NET::ERR_CERT_INVALID, choose to continue anyway.
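
A nodePort value is only accepted if it falls inside the API server's NodePort range, which defaults to 30000-32767. The sketch below checks a candidate port before it goes into the YAML. `valid_node_port` is a hypothetical helper name, and it assumes the cluster has not changed the range with the API server's `--service-node-port-range` flag.

```shell
# Hypothetical helper: succeed only if the argument is a number inside the
# default NodePort range (30000-32767).
valid_node_port() {
  case $1 in
    ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
  esac
  [ "$1" -ge 30000 ] && [ "$1" -le 32767 ]
}

# Example: 30443 (used above) is inside the range, 8443 is not.
if valid_node_port 30443; then
  echo "30443 can be used as a nodePort"
fi
```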

Logging in to the dashboard with a token

  • After opening the Dashboard, choose Token login and enter the token obtained above.

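Troubleshooting

Dashboard fails to start because of an API server connection error
  • Startup fails because the Dashboard cannot connect to the API server
kubectl logs $(kubectl get po -n=kube-system | grep dashboard | tail -n 1 | awk '{print $1}') -n=kube-system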
[root@local k8s]$ kubectl logs $(kubectl get po -n=kube-system | grep dashboard | tail -n 1 | awk '{print $1}') -n=kube-system
2018/10/02 08:58:15 Starting overwatch
2018/10/02 08:58:15 Using in-cluster config to connect to apiserver
2018/10/02 08:58:15 Using service account token for csrf signing
2018/10/02 08:58:15 No request provided. Skipping authorization
2018/10/02 08:58:16 Error while initializing connection to Kubernetes apiserver. This most likely means that the cluster is misconfigured (e.g., it has invalid apiserver certificates or service account's configuration) or the --apiserver-host param points to a server that does not exist. Reason: Get https://10.96.0.1:443/version: dial tcp 10.96.0.1:443: getsockopt: no route to host
Refer to our FAQ and wiki pages for more information: https://github.com/kubernetes/dashboard/wiki/FAQ
  • Resolution

    • Expected state of kube-system when everything is healthy

      [root@localhost k8s]$ kubectl get nodes,po,svc -n=kube-system
      NAME         STATUS   ROLES    AGE     VERSION
      node/2-111   Ready    master   9m51s   v1.12.1
      node/2-112   Ready    <none>   9m33s   v1.12.1
      node/2-113   Ready    <none>   9m31s   v1.12.1
      node/2-114   Ready    <none>   9m29s   v1.12.1

      NAME                                        READY   STATUS    RESTARTS   AGE
      pod/coredns-576cbf47c7-4mng5                1/1     Running   0          9m41s
      pod/coredns-576cbf47c7-nm46w                1/1     Running   0          9m41s
      pod/etcd-2-111                              1/1     Running   0          8m53s
      pod/kube-apiserver-2-111                    1/1     Running   0          8m41s
      pod/kube-controller-manager-2-111           1/1     Running   0          8m57s
      pod/kube-proxy-29c28                        1/1     Running   0          9m33s
      pod/kube-proxy-555dm                        1/1     Running   0          9m29s
      pod/kube-proxy-nl4c7                        1/1     Running   0          9m31s
      pod/kube-proxy-qpq99                        1/1     Running   0          9m41s
      pod/kube-scheduler-2-111                    1/1     Running   0          8m32s
      pod/kubernetes-dashboard-77fd78f978-bh6x7   1/1     Running   0          4m18s
      pod/weave-net-7h6p2                         2/2     Running   0          7m20s
      pod/weave-net-jg8nk                         2/2     Running   0          7m20s
      pod/weave-net-jj88s                         2/2     Running   0          7m20s
      pod/weave-net-kff86                         2/2     Running   0          7m20s

      NAME                           TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)         AGE
      service/kube-dns               ClusterIP   10.96.0.10      <none>        53/UDP,53/TCP   9m47s
      service/kubernetes-dashboard   NodePort    10.105.50.231   <none>        443:30443/TCP   4m18s
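    • Step 1: check whether coredns has finished starting. If it has not, reconfigure it.

    • Step 2: check whether the weave CNI network plugin has finished starting. If it has not, reconfigure it.
    • If neither step resolves the problem, search online for the exact error message.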
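
Both checks come down to finding pods in kube-system that are not fully up. A minimal sketch that filters `kubectl get po` output for such pods; `not_ready_pods` is a hypothetical helper name, and it assumes the default column layout (NAME, READY, STATUS, ...) with a header row.

```shell
# Hypothetical helper: reads `kubectl get po` output on stdin and prints the
# names of pods whose STATUS is not Running or whose READY count is incomplete.
not_ready_pods() {
  awk 'NR > 1 && NF >= 3 {
         split($2, ready, "/")              # READY column, e.g. "1/2"
         if ($3 != "Running" || ready[1] != ready[2]) print $1
       }'
}

# Intended use on the master:
#   kubectl get po -n kube-system | not_ready_pods
```

An empty result means every listed pod is Running and fully ready. Pods that finished normally (STATUS Completed) would also be listed, which is acceptable here because kube-system normally runs only long-lived pods.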

Installing Heapster monitoring

kubectl apply -f https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/standalone/heapster-controller.yaml

Completely removing the installation (this removes everything, including Docker)

kubeadm reset -f
yum remove 'docker*' kubeadm kubectl kubelet -y
ip link delete docker0
ip link delete cni0
ip link delete weave
ip link delete flannel.1
rpm -e $(rpm -qa | grep docker)

Related issues

The Kubernetes cluster does not start automatically after a reboot

  • Running kubelet by hand shows the error: swap was not disabled
[root@2-225 ~]$ kubelet
I1010 14:52:48.559293 6356 server.go:408] Version: v1.11.3
I1010 14:52:48.560087 6356 plugins.go:97] No cloud provider specified.
W1010 14:52:48.560126 6356 server.go:549] standalone mode, no API client
W1010 14:52:48.668530 6356 server.go:465] No api server defined - no events will be sent to API server.
I1010 14:52:48.668552 6356 server.go:648] --cgroups-per-qos enabled, but --cgroup-root was not specified. defaulting to /
F1010 14:52:48.668740 6356 server.go:262] failed to run Kubelet: Running with swap on is not supported, please disable swap! or set --fail-swap-on
flag to false. /proc/swaps contained: [Filename Type Size Used Priority /dev/dm-1 partition 8191996 0 -1]
  • Edit /etc/fstab and comment out every entry whose type is swap
#
# /etc/fstab
# Created by anaconda on Wed Aug 22 15:10:41 2018
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
/dev/mapper/centos-root / xfs defaults 0 0
UUID=46ae95c4-220d-46a9-9691-2cf76452ecc8 /boot xfs defaults 0 0
UUID=C707-FC17 /boot/efi vfat umask=0077,shortname=winnt 0 0
/dev/mapper/centos-home /home xfs defaults 0 0
#/dev/mapper/centos-swap swap swap defaults 0 0
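
The same edit can be scripted. The sketch below is a filter that comments out fstab entries whose third field (the filesystem type) is `swap` and leaves already-commented lines alone, so running it twice does not add a second `#`. `comment_swap` is a hypothetical helper name; writing the result back to /etc/fstab is left to the reader and should be preceded by a backup.

```shell
# Hypothetical helper: reads fstab content on stdin and writes it to stdout
# with active swap entries commented out. Lines that are already comments
# are passed through unchanged.
comment_swap() {
  awk '$1 !~ /^#/ && $3 == "swap" { print "#" $0; next } { print }'
}

# Intended use (as root), keeping a backup of the original file:
#   cp /etc/fstab /etc/fstab.bak
#   comment_swap < /etc/fstab.bak > /etc/fstab
```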