
Installing, Configuring, and Testing K8S on Tencent Cloud

阿辉的博客 (Ahui's Blog)

1. k8s Cluster Planning

1.1 Kubernetes 1.10 Dependencies

k8s v1.10 does not support or has not been tested against every version of related components such as etcd and docker. The recommended versions are as follows:

Reference: External Dependencies
* The supported etcd server version is 3.1.12, as compared to 3.0.17 in v1.9 ( #60988 )
* The validated docker versions are the same as for v1.9: 1.11.2 to 1.13.1 and 17.03.x ( ref )
* The Go version is go1.9.3, as compared to go1.9.2 in v1.9. ( #59012 )
* The minimum supported go is the same as for v1.9: go1.9.1. ( #55301 )
* CNI is the same as v1.9: v0.6.0 ( #51250 )
* CSI is updated to 0.2.0 as compared to 0.1.0 in v1.9. ( #60736 )
* The dashboard add-on has been updated to v1.8.3, as compared to 1.8.0 in v1.9. ( #57326 )
* Heapster is the same as v1.9: v1.5.0. It will be upgraded in v1.11. ( ref )
* Cluster Autoscaler has been updated to v1.2.0. ( #60842 , @mwielgus )
* Updates kube-dns to v1.14.8 ( #57918 , @rramkumar1 )
* Influxdb is unchanged from v1.9: v1.3.3 ( #53319 )
* Grafana is unchanged from v1.9: v4.4.3 ( #53319 )
* CAdvisor is v0.29.1 ( #60867 )
* fluentd-gcp-scaler is v0.3.0 ( #61269 )
* Updated fluentd in fluentd-es-image to fluentd v1.1.0 ( #58525 , @monotek )
* fluentd-elasticsearch is v2.0.4 ( #58525 )
* Updated fluentd-gcp to v3.0.0. ( #60722 )
* Ingress glbc is v1.0.0 ( #61302 )
* OIDC authentication is coreos/go-oidc v2 ( #58544 )
* Updated fluentd-gcp updated to v2.0.11. ( #56927 , @x13n )
* Calico has been updated to v2.6.7 ( #59130 , @caseydavenport )

1.2 Test Server Preparation and Environment Planning

| Server name | IP | Role | Services |
| --- | --- | --- | --- |
| sh-saas-cvmk8s-master-01 | 10.12.96.3 | master | master, etcd |
| sh-saas-cvmk8s-master-02 | 10.12.96.5 | master | master, etcd |
| sh-saas-cvmk8s-master-03 | 10.12.96.13 | master | master, etcd |
| sh-saas-cvmk8s-node-01 | 10.12.96.2 | node | node |
| sh-saas-cvmk8s-node-02 | 10.12.96.4 | node | node |
| sh-saas-cvmk8s-node-03 | 10.12.96.6 | node | node |
| bs-ops-test-docker-dev-04 | 172.21.248.242 | private image registry | harbor |
| VIP | 10.12.96.100 | master vip | netmask: 255.255.255.0 |

All netmasks are 255.255.255.0.

All test servers run the latest CentOS Linux 7.4.

The VIP 10.12.96.100 is only used for the keepalived test. This article actually uses the Tencent Cloud LB + haproxy mode, with the Tencent Cloud LB VIP 10.12.16.101.

Pod (container) network: 10.254.0.0/16
The container network must not conflict with the following:

k8s service cluster network: 10.254.255.0/24

1.3 The Cloud Controller tencentcloud-cloud-controller-manager

This test uses Tencent Cloud CVM hosts on a VPC network; the k8s masters and nodes share one subnet: 10.12.96.0/24.

tencentcloud-cloud-controller-manager is Tencent Cloud's cloud controller manager for k8s: https://github.com/dbdd4us/tencentcloud-cloud-controller-manager

For its configuration, see: https://github.com/dbdd4us/tencentcloud-cloud-controller-manager/blob/master/README_zhCN.md

In its current version, tencentcloud-cloud-controller-manager mainly lets containers on K8S use the VPC network directly, so that other CVM hosts in the VPC can reach K8S containers directly.

For tencentcloud-cloud-controller-manager's requirements on K8S and related resources, see its README.

2. Preparation Before Installing the k8s Cluster

2.1 OS Environment Configuration

Run on all servers:

[root@VM_96_3_centos ~]# hostnamectl set-hostname sh-saas-cvmk8s-master-01
[root@VM_96_3_centos ~]# hostnamectl --pretty

[root@VM_96_3_centos ~]# hostnamectl --static
sh-saas-cvmk8s-master-01
[root@VM_96_3_centos ~]# hostnamectl --transient
sh-saas-cvmk8s-master-01
[root@VM_96_3_centos ~]# cat /etc/hostname 
sh-saas-cvmk8s-master-01

The above only renames one server; run the corresponding command on the other servers to set their hostnames.

systemctl stop firewalld
systemctl disable firewalld
swapoff -a 
sed -i 's/.*swap.*/#&/' /etc/fstab
setenforce  0 
sed -i "s/^SELINUX=enforcing/SELINUX=disabled/g" /etc/sysconfig/selinux 
sed -i "s/^SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config 
sed -i "s/^SELINUX=permissive/SELINUX=disabled/g" /etc/sysconfig/selinux 
sed -i "s/^SELINUX=permissive/SELINUX=disabled/g" /etc/selinux/config  
modprobe br_netfilter
cat <<EOF >  /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl -p /etc/sysctl.d/k8s.conf
ls /proc/sys/net/bridge
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
yum install -y epel-release
yum install -y yum-utils device-mapper-persistent-data lvm2 net-tools conntrack-tools wget vim  ntpdate libseccomp libtool-ltdl
yum install -y bash-argsparse bash-completion bash-completion-extras
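The four SELinux sed calls above can be collapsed into a single expression that handles both enforcing and permissive. A small sketch; it operates on a throwaway copy so it can be dry-run safely, while on a real host you would point it at /etc/selinux/config and /etc/sysconfig/selinux:

```shell
# Work on a throwaway copy instead of the real config files.
tmpconf=$(mktemp)
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$tmpconf"

# One extended-regex substitution replaces the four separate sed calls.
sed -i -E 's/^SELINUX=(enforcing|permissive)/SELINUX=disabled/' "$tmpconf"

selinux_line=$(grep '^SELINUX=' "$tmpconf")
echo "$selinux_line"    # SELINUX=disabled
rm -f "$tmpconf"
```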

Tencent Cloud images already ship with the ntpd service, so this step can be skipped there.

systemctl enable ntpdate.service
echo '*/30 * * * * /usr/sbin/ntpdate time7.aliyun.com >/dev/null 2>&1' > /tmp/crontab2.tmp
crontab /tmp/crontab2.tmp
systemctl start ntpdate.service
echo "* soft nofile 65536" >> /etc/security/limits.conf
echo "* hard nofile 65536" >> /etc/security/limits.conf
echo "* soft nproc 65536"  >> /etc/security/limits.conf
echo "* hard nproc 65536"  >> /etc/security/limits.conf
echo "* soft  memlock  unlimited"  >> /etc/security/limits.conf
echo "* hard memlock  unlimited"  >> /etc/security/limits.conf

2.2 Installing docker

Install docker 17.06 on all nodes. Note: the highest docker version validated against k8s v1.10 is 17.03.x, and other versions are not recommended; however, Tencent Cloud requires 17.05 or later, so we use 17.06.x.

yum remove docker \
            docker-client \
            docker-client-latest \
            docker-common \
            docker-latest \
            docker-latest-logrotate \
            docker-logrotate \
            docker-selinux \
            docker-engine-selinux \
            docker-engine
yum install -y yum-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
#yum-config-manager --enable docker-ce-edge
#yum-config-manager --enable docker-ce-test
#The official docker YUM packages have a dependency conflict and need the flag --setopt=obsoletes=0. Tencent Cloud requires 17.05+
yum install -y --setopt=obsoletes=0 docker-ce-17.06.2.ce-1.el7.centos docker-ce-selinux-17.06.2.ce-1.el7.centos


vim /usr/lib/systemd/system/docker.service
#Find the ExecStart= line and add the registry mirror, insecure registry, etc., as follows:
ExecStart=/usr/bin/dockerd  -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock --registry-mirror=https://iw5fhz08.mirror.aliyuncs.com  --insecure-registry 172.21.248.242
#Traffic between different nodes must be allowed
ExecStartPost=-/sbin/iptables -I FORWARD -s 0.0.0.0/0 -j ACCEPT

#Or run the following commands to edit the config file:
sed -i "s#^ExecStart=/usr/bin/dockerd.*#ExecStart=/usr/bin/dockerd  -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock --iptables=false --ip-masq=false --bip=172.17.0.1/16  --registry-mirror=https://iw5fhz08.mirror.aliyuncs.com  --insecure-registry 172.21.248.242#g" /usr/lib/systemd/system/docker.service
sed -i '/ExecStart=/a\ExecStartPost=-/sbin/iptables -I FORWARD -s 0.0.0.0/0 -j ACCEPT' /usr/lib/systemd/system/docker.service

#The complete config file:
[root@10 ~]# cat /usr/lib/systemd/system/docker.service 
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service
Wants=network-online.target

[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd  -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock --iptables=false --ip-masq=false --bip=172.17.0.1/16  --registry-mirror=https://iw5fhz08.mirror.aliyuncs.com  --insecure-registry 172.21.248.242
ExecStartPost=-/sbin/iptables -I FORWARD -s 0.0.0.0/0 -j ACCEPT
ExecReload=/bin/kill -s HUP $MAINPID
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
# Uncomment TasksMax if your systemd version supports it.
# Only systemd 226 and above support this version.
#TasksMax=infinity
TimeoutStartSec=0
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process
# restart the docker process if it exits prematurely
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s

[Install]
WantedBy=multi-user.target

#Reload the configuration
systemctl daemon-reload

#Start Docker
systemctl enable docker
systemctl start docker

2.3 Installing the harbor Private Image Registry

The following only needs to be run on the 172.21.248.242 server:

yum -y install docker-compose
wget https://storage.googleapis.com/harbor-releases/release-1.4.0/harbor-offline-installer-v1.4.0.tgz
mkdir /data/
cd /data/
tar xvzf /root/harbor-offline-installer-v1.4.0.tgz
cd harbor
vim harbor.cfg
hostname = 172.21.248.242

sh install.sh

#To change the configuration:
docker-compose down -v
vim harbor.cfg
prepare
docker-compose up -d

#Restart:
docker-compose stop
docker-compose start

Username: admin
Password: Harbor12345

docker login 172.21.248.242

Enter the username and password to log in.

docker pull centos
docker tag centos:latest 172.21.248.242/library/centos:latest
docker push 172.21.248.242/library/centos:latest

The image is now visible in the web UI.

docker pull 172.21.248.242/library/centos:latest

3. Installing and Configuring etcd

3.1 Creating the etcd Certificates

The following only needs to be run on sh-saas-cvmk8s-master-01:

3.1.1 Setting Up the cfssl Environment

wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
chmod +x cfssl_linux-amd64
mv cfssl_linux-amd64 /usr/local/bin/cfssl
chmod +x cfssljson_linux-amd64
mv cfssljson_linux-amd64 /usr/local/bin/cfssljson
chmod +x cfssl-certinfo_linux-amd64
mv cfssl-certinfo_linux-amd64 /usr/local/bin/cfssl-certinfo
export PATH=/usr/local/bin:$PATH

3.1.2 Creating the CA Configuration Files

The IPs configured into the certificate below are the etcd node IPs. To make future scaling easier, consider adding a few reserved IP addresses; if you may use domain names later, add those too:

mkdir -p /data/k8s/ssl
cd  /data/k8s/ssl
cat > ca-config.json <<EOF
{
  "signing": {
    "default": {
      "expiry": "8760h"
    },
    "profiles": {
      "kubernetes": {
        "usages": [
            "signing",
            "key encipherment",
            "server auth",
            "client auth"
        ],
        "expiry": "8760h"
      }
    }
  }
}
EOF

cat > ca-csr.json <<EOF
{
  "CN": "kubernetes",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [{
    "C": "CN",
    "ST": "shanghai",
    "L": "shanghai",
    "O": "k8s",
    "OU": "System"
  }]
}
EOF

cfssl gencert -initca ca-csr.json | cfssljson -bare ca
[root@sh-saas-cvmk8s-master-01 ssl]# cfssl gencert -initca ca-csr.json | cfssljson -bare ca
/06/27 15:16:02 [INFO] generating a new CA key and certificate from CSR
/06/27 15:16:02 [INFO] generate received request
/06/27 15:16:02 [INFO] received CSR
/06/27 15:16:02 [INFO] generating key: rsa-2048
/06/27 15:16:03 [INFO] encoded CSR
/06/27 15:16:03 [INFO] signed certificate with serial number 527592143849763980635125886376161678689174991664
[root@sh-saas-cvmk8s-master-01 ssl]# ll
total 20
-rw-r--r-- 1 root root  290 Jun 27 15:14 ca-config.json
-rw-r--r-- 1 root root 1005 Jun 27 15:16 ca.csr
-rw-r--r-- 1 root root  190 Jun 27 15:15 ca-csr.json
-rw------- 1 root root 1679 Jun 27 15:16 ca-key.pem
-rw-r--r-- 1 root root 1363 Jun 27 15:16 ca.pem
[root@sh-saas-cvmk8s-master-01 ssl]# 

cat > etcd-csr.json <<EOF
{
  "CN": "etcd",
  "hosts": [
    "127.0.0.1",
    "10.12.96.3",
    "10.12.96.5",
    "10.12.96.13",
    "10.12.96.100",
    "10.12.96.10",
    "10.12.96.11"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "shanghai",
      "L": "shanghai",
      "O": "k8s",
      "OU": "System"
    }
  ]
}
EOF

cfssl gencert -ca=ca.pem \
  -ca-key=ca-key.pem \
  -config=ca-config.json \
  -profile=kubernetes etcd-csr.json | cfssljson -bare etcd

[root@sh-saas-cvmk8s-master-01 ssl]# ll
total 36
-rw-r--r-- 1 root root  290 Jun 27 15:14 ca-config.json
-rw-r--r-- 1 root root 1005 Jun 27 15:16 ca.csr
-rw-r--r-- 1 root root  190 Jun 27 15:15 ca-csr.json
-rw------- 1 root root 1679 Jun 27 15:16 ca-key.pem
-rw-r--r-- 1 root root 1363 Jun 27 15:16 ca.pem
-rw-r--r-- 1 root root 1135 Jun 27 15:20 etcd.csr
-rw-r--r-- 1 root root  391 Jun 27 15:20 etcd-csr.json
-rw------- 1 root root 1675 Jun 27 15:20 etcd-key.pem
-rw-r--r-- 1 root root 1509 Jun 27 15:20 etcd.pem

3.1.3 Copying the Certificates to the Other Two etcd Nodes

mkdir -p /etc/etcd/ssl
cp etcd.pem etcd-key.pem ca.pem /etc/etcd/ssl/
ssh -n 10.12.96.5 "mkdir -p /etc/etcd/ssl && exit"
ssh -n 10.12.96.13 "mkdir -p /etc/etcd/ssl && exit"
scp -r /etc/etcd/ssl/*.pem 10.12.96.5:/etc/etcd/ssl/
scp -r /etc/etcd/ssl/*.pem 10.12.96.13:/etc/etcd/ssl/

3.2 Installing etcd

k8s v1.10 was validated against etcd 3.1.12, so we use that version too. It is not available in the YUM repos, so we install from the binary tarball.
Run the following on all three etcd nodes (which are also the masters):

wget https://github.com/coreos/etcd/releases/download/v3.1.12/etcd-v3.1.12-linux-amd64.tar.gz
tar -xvf etcd-v3.1.12-linux-amd64.tar.gz
mv etcd-v3.1.12-linux-amd64/etcd* /usr/local/bin

# Create the etcd systemd unit file.
cat > /usr/lib/systemd/system/etcd.service <<EOF
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos

[Service]
Type=notify
WorkingDirectory=/var/lib/etcd/
EnvironmentFile=-/etc/etcd/etcd.conf
ExecStart=/usr/local/bin/etcd \\
  --name \${ETCD_NAME} \\
  --cert-file=\${ETCD_CERT_FILE} \\
  --key-file=\${ETCD_KEY_FILE} \\
  --peer-cert-file=\${ETCD_PEER_CERT_FILE} \\
  --peer-key-file=\${ETCD_PEER_KEY_FILE} \\
  --trusted-ca-file=\${ETCD_TRUSTED_CA_FILE} \\
  --peer-trusted-ca-file=\${ETCD_PEER_TRUSTED_CA_FILE} \\
  --initial-advertise-peer-urls \${ETCD_INITIAL_ADVERTISE_PEER_URLS} \\
  --listen-peer-urls \${ETCD_LISTEN_PEER_URLS} \\
  --listen-client-urls \${ETCD_LISTEN_CLIENT_URLS} \\
  --advertise-client-urls \${ETCD_ADVERTISE_CLIENT_URLS} \\
  --initial-cluster-token \${ETCD_INITIAL_CLUSTER_TOKEN} \\
  --initial-cluster \${ETCD_INITIAL_CLUSTER} \\
  --initial-cluster-state new \\
  --data-dir=\${ETCD_DATA_DIR}
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

#etcd config file. This is the 10.12.96.3 node's config; for the other two etcd nodes, change the IPs to the corresponding node's IP
#and set ETCD_NAME to that node's infra1/2/3.
cat > /etc/etcd/etcd.conf <<EOF
ETCD_DATA_DIR="/var/lib/etcd/"
ETCD_LISTEN_PEER_URLS="https://10.12.96.3:2380"
ETCD_LISTEN_CLIENT_URLS="https://10.12.96.3:2379,http://127.0.0.1:2379"
ETCD_NAME="infra1"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://10.12.96.3:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://10.12.96.3:2379"
ETCD_INITIAL_CLUSTER="infra1=https://10.12.96.3:2380,infra2=https://10.12.96.5:2380,infra3=https://10.12.96.13:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_CERT_FILE="/etc/etcd/ssl/etcd.pem"
ETCD_KEY_FILE="/etc/etcd/ssl/etcd-key.pem"
ETCD_TRUSTED_CA_FILE="/etc/etcd/ssl/ca.pem"
ETCD_PEER_CERT_FILE="/etc/etcd/ssl/etcd.pem"
ETCD_PEER_KEY_FILE="/etc/etcd/ssl/etcd-key.pem"
ETCD_PEER_TRUSTED_CA_FILE="/etc/etcd/ssl/ca.pem"
EOF

#etcd config file for the 10.12.96.5 node.
cat > /etc/etcd/etcd.conf <<EOF
ETCD_DATA_DIR="/var/lib/etcd/"
ETCD_LISTEN_PEER_URLS="https://10.12.96.5:2380"
ETCD_LISTEN_CLIENT_URLS="https://10.12.96.5:2379,http://127.0.0.1:2379"
ETCD_NAME="infra2"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://10.12.96.5:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://10.12.96.5:2379"
ETCD_INITIAL_CLUSTER="infra1=https://10.12.96.3:2380,infra2=https://10.12.96.5:2380,infra3=https://10.12.96.13:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_CERT_FILE="/etc/etcd/ssl/etcd.pem"
ETCD_KEY_FILE="/etc/etcd/ssl/etcd-key.pem"
ETCD_TRUSTED_CA_FILE="/etc/etcd/ssl/ca.pem"
ETCD_PEER_CERT_FILE="/etc/etcd/ssl/etcd.pem"
ETCD_PEER_KEY_FILE="/etc/etcd/ssl/etcd-key.pem"
ETCD_PEER_TRUSTED_CA_FILE="/etc/etcd/ssl/ca.pem"
EOF

#etcd config file for the 10.12.96.13 node.
cat > /etc/etcd/etcd.conf <<EOF
ETCD_DATA_DIR="/var/lib/etcd/"
ETCD_LISTEN_PEER_URLS="https://10.12.96.13:2380"
ETCD_LISTEN_CLIENT_URLS="https://10.12.96.13:2379,http://127.0.0.1:2379"
ETCD_NAME="infra3"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://10.12.96.13:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://10.12.96.13:2379"
ETCD_INITIAL_CLUSTER="infra1=https://10.12.96.3:2380,infra2=https://10.12.96.5:2380,infra3=https://10.12.96.13:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_CERT_FILE="/etc/etcd/ssl/etcd.pem"
ETCD_KEY_FILE="/etc/etcd/ssl/etcd-key.pem"
ETCD_TRUSTED_CA_FILE="/etc/etcd/ssl/ca.pem"
ETCD_PEER_CERT_FILE="/etc/etcd/ssl/etcd.pem"
ETCD_PEER_KEY_FILE="/etc/etcd/ssl/etcd-key.pem"
ETCD_PEER_TRUSTED_CA_FILE="/etc/etcd/ssl/ca.pem"
EOF
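The three etcd.conf files above are identical except for ETCD_NAME and the node IP, so they can be generated from a single template. A sketch writing into an assumed temporary directory; on each real node you would install the matching file as /etc/etcd/etcd.conf:

```shell
# name:ip pairs taken from the cluster plan above.
nodes="infra1:10.12.96.3 infra2:10.12.96.5 infra3:10.12.96.13"
outdir=$(mktemp -d)   # stand-in for /etc/etcd on each node

for pair in $nodes; do
  name=${pair%%:*}
  ip=${pair#*:}
  # Everything except ETCD_NAME and the listen/advertise IPs is shared.
  cat > "$outdir/etcd.conf.$name" <<EOF
ETCD_DATA_DIR="/var/lib/etcd/"
ETCD_LISTEN_PEER_URLS="https://$ip:2380"
ETCD_LISTEN_CLIENT_URLS="https://$ip:2379,http://127.0.0.1:2379"
ETCD_NAME="$name"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://$ip:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://$ip:2379"
ETCD_INITIAL_CLUSTER="infra1=https://10.12.96.3:2380,infra2=https://10.12.96.5:2380,infra3=https://10.12.96.13:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_CERT_FILE="/etc/etcd/ssl/etcd.pem"
ETCD_KEY_FILE="/etc/etcd/ssl/etcd-key.pem"
ETCD_TRUSTED_CA_FILE="/etc/etcd/ssl/ca.pem"
ETCD_PEER_CERT_FILE="/etc/etcd/ssl/etcd.pem"
ETCD_PEER_KEY_FILE="/etc/etcd/ssl/etcd-key.pem"
ETCD_PEER_TRUSTED_CA_FILE="/etc/etcd/ssl/ca.pem"
EOF
done
ls "$outdir"
```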

mkdir /var/lib/etcd/
systemctl daemon-reload
systemctl enable etcd
systemctl start etcd
systemctl status etcd

Once the etcd service is up on all three nodes, run the following command to check the cluster state. Output like the following means the configuration succeeded.

etcdctl \
--ca-file=/etc/etcd/ssl/ca.pem \
--cert-file=/etc/etcd/ssl/etcd.pem \
--key-file=/etc/etcd/ssl/etcd-key.pem \
cluster-health
-06-27 16:18:58.586432 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
-06-27 16:18:58.586965 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
member 8a0005e5779bed5f is healthy: got healthy result from https://10.12.96.13:2379
member 955000a144b11fb7 is healthy: got healthy result from https://10.12.96.5:2379
member d7001efc12afef41 is healthy: got healthy result from https://10.12.96.3:2379
cluster is healthy

4. Master Load Balancing

The k8s master api servers need load balancing, otherwise they are a single point of failure. On Tencent Cloud this can be done either with an elastic IP plus keepalived, or with Tencent Cloud LB + haproxy.

4.1 Using keepalived

I only tested manually binding the LB to one of the master nodes and configuring keepalived; for production use, see 7.5 so the VIP can be rebound dynamically on failure.

4.1.1 Installing keepalived

Install keepalived on the 3 master nodes:

yum install -y keepalived
systemctl enable keepalived

4.1.2 Configuring keepalived

cat <<EOF > /etc/keepalived/keepalived.conf
global_defs {
   router_id LVS_k8s
}

vrrp_script CheckK8sMaster {
    script "curl -k https://10.12.96.100:6443"
    interval 3
    timeout 9
    fall 2
    rise 2
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 61
    priority 100
    advert_int 1
    mcast_src_ip 10.12.96.3
    nopreempt
    authentication {
        auth_type PASS
        auth_pass kwi28x7sx37uw72
    }
    unicast_peer {
        10.12.96.5
        10.12.96.13
    }
    virtual_ipaddress {
        10.12.96.100/24
    }
    track_script {
        CheckK8sMaster
    }

}
EOF
cat <<EOF > /etc/keepalived/keepalived.conf
global_defs {
   router_id LVS_k8s
}

vrrp_script CheckK8sMaster {
    script "curl -k https://10.12.96.100:6443"
    interval 3
    timeout 9
    fall 2
    rise 2
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 61
    priority 90
    advert_int 1
    mcast_src_ip 10.12.96.5
    nopreempt
    authentication {
        auth_type PASS
        auth_pass kwi28x7sx37uw72
    }
    unicast_peer {
        10.12.96.3
        10.12.96.13
    }
    virtual_ipaddress {
        10.12.96.100/24
    }
    track_script {
        CheckK8sMaster
    }

}
EOF
cat <<EOF > /etc/keepalived/keepalived.conf
global_defs {
   router_id LVS_k8s
}

vrrp_script CheckK8sMaster {
    script "curl -k https://10.12.96.100:6443"
    interval 3
    timeout 9
    fall 2
    rise 2
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 61
    priority 80
    advert_int 1
    mcast_src_ip 10.12.96.13
    nopreempt
    authentication {
        auth_type PASS
        auth_pass kwi28x7sx37uw72
    }
    unicast_peer {
        10.12.96.3
        10.12.96.5
    }
    virtual_ipaddress {
        10.12.96.100/24
    }
    track_script {
        CheckK8sMaster
    }

}
EOF
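The three keepalived configs above differ only in priority, mcast_src_ip, and the unicast_peer list (each node's peers are the other two masters), so they too can be generated. A sketch writing to a temporary directory; on each real node the matching file would be installed as /etc/keepalived/keepalived.conf:

```shell
masters="10.12.96.3 10.12.96.5 10.12.96.13"
outdir=$(mktemp -d)   # stand-in for /etc/keepalived on each node

prio=100
for self in $masters; do
  # Peers are all masters except this node.
  peers=""
  for m in $masters; do
    [ "$m" = "$self" ] || peers="$peers        $m
"
  done
  cat > "$outdir/keepalived.conf.$self" <<EOF
global_defs {
   router_id LVS_k8s
}

vrrp_script CheckK8sMaster {
    script "curl -k https://10.12.96.100:6443"
    interval 3
    timeout 9
    fall 2
    rise 2
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 61
    priority $prio
    advert_int 1
    mcast_src_ip $self
    nopreempt
    authentication {
        auth_type PASS
        auth_pass kwi28x7sx37uw72
    }
    unicast_peer {
$peers    }
    virtual_ipaddress {
        10.12.96.100/24
    }
    track_script {
        CheckK8sMaster
    }
}
EOF
  prio=$((prio - 10))
done
ls "$outdir"
```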

4.1.3 Starting and Verifying keepalived

systemctl restart keepalived

[root@sh-saas-cvmk8s-master-01 k8s]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 52:54:00:52:fe:50 brd ff:ff:ff:ff:ff:ff
    inet 10.12.96.3/24 brd 10.12.96.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet 10.12.96.100/24 scope global secondary eth0
       valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default 
    link/ether 02:42:ed:6c:e3:ab brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
[root@sh-saas-cvmk8s-master-01 k8s]# 

We can see that the VIP 10.12.96.100 is on sh-saas-cvmk8s-master-01.
After shutting sh-saas-cvmk8s-master-01 down, the VIP moves to sh-saas-cvmk8s-master-02.

[root@sh-saas-cvmk8s-master-02 etcd]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 52:54:00:13:65:cf brd ff:ff:ff:ff:ff:ff
    inet 10.12.96.5/24 brd 10.12.96.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet 10.12.96.100/24 scope global secondary eth0
       valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default 
    link/ether 02:42:da:c8:18:2a brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever

4.2 Using Tencent Cloud LB + haproxy

Tencent Cloud's LB has one issue: when a real server behind the LB accesses the LB's VIP address, the connection fails.
So we need two additional servers running haproxy, added as the Tencent Cloud LB's backends. On the LB, configure listeners for the k8s api server's TCP secure port 6443 and TCP insecure port 8088 (if tencentcloud-cloud-controller-manager uses the ServiceAccount method, the insecure listener can be omitted).
Here are the steps to install and configure haproxy on the 2 servers:

yum install haproxy

vim /etc/haproxy/haproxy.cfg
#Append at the end:
listen  proxy_server_6443
    bind 0.0.0.0:6443
    mode tcp
    balance roundrobin
    server k8s_master1 10.12.96.3:6443 check inter 5000 rise 1 fall 2
    server k8s_master2 10.12.96.5:6443 check inter 5000 rise 1 fall 2
    server k8s_master3 10.12.96.13:6443 check inter 5000 rise 1 fall 2

listen  proxy_server_8088
    bind 0.0.0.0:8088
    mode tcp
    balance roundrobin
    server k8s_master1 10.12.96.3:8088 check inter 5000 rise 1 fall 2
    server k8s_master2 10.12.96.5:8088 check inter 5000 rise 1 fall 2
    server k8s_master3 10.12.96.13:8088 check inter 5000 rise 1 fall 2

#Start haproxy
/etc/init.d/haproxy start

5. Installing and Configuring the k8s Cluster with kubeadm

5.1 Pulling the Official Images

Because the official images are very slow or unreachable from inside China, first pull the images into the private registry:

docker pull anjia0532/pause-amd64:3.1
docker pull anjia0532/google-containers.kube-apiserver-amd64:v1.10.5
docker pull anjia0532/google-containers.kube-proxy-amd64:v1.10.5
docker pull anjia0532/google-containers.kube-scheduler-amd64:v1.10.5
docker pull anjia0532/google-containers.kube-controller-manager-amd64:v1.10.5
docker pull anjia0532/k8s-dns-sidecar-amd64:1.14.8
docker pull anjia0532/k8s-dns-kube-dns-amd64:1.14.8
docker pull anjia0532/k8s-dns-dnsmasq-nanny-amd64:1.14.8
docker pull anjia0532/kubernetes-dashboard-amd64:v1.8.3
docker pull anjia0532/heapster-influxdb-amd64:v1.3.3
docker pull anjia0532/heapster-grafana-amd64:v4.4.3
docker pull anjia0532/heapster-amd64:v1.5.3
docker pull anjia0532/etcd-amd64:3.1.12
docker pull anjia0532/fluentd-elasticsearch:v2.0.4
docker pull anjia0532/cluster-autoscaler:v1.2.0

docker tag anjia0532/pause-amd64:3.1 172.21.248.242/k8s/pause-amd64:3.1
docker tag anjia0532/google-containers.kube-apiserver-amd64:v1.10.5 172.21.248.242/k8s/kube-apiserver-amd64:v1.10.5
docker tag anjia0532/google-containers.kube-proxy-amd64:v1.10.5 172.21.248.242/k8s/kube-proxy-amd64:v1.10.5
docker tag anjia0532/google-containers.kube-scheduler-amd64:v1.10.5 172.21.248.242/k8s/kube-scheduler-amd64:v1.10.5
docker tag anjia0532/google-containers.kube-controller-manager-amd64:v1.10.5 172.21.248.242/k8s/kube-controller-manager-amd64:v1.10.5
docker tag anjia0532/k8s-dns-sidecar-amd64:1.14.8 172.21.248.242/k8s/k8s-dns-sidecar-amd64:1.14.8
docker tag anjia0532/k8s-dns-kube-dns-amd64:1.14.8 172.21.248.242/k8s/k8s-dns-kube-dns-amd64:1.14.8
docker tag anjia0532/k8s-dns-dnsmasq-nanny-amd64:1.14.8 172.21.248.242/k8s/k8s-dns-dnsmasq-nanny-amd64:1.14.8
docker tag anjia0532/kubernetes-dashboard-amd64:v1.8.3 172.21.248.242/k8s/kubernetes-dashboard-amd64:v1.8.3
docker tag anjia0532/heapster-influxdb-amd64:v1.3.3 172.21.248.242/k8s/heapster-influxdb-amd64:v1.3.3
docker tag anjia0532/heapster-grafana-amd64:v4.4.3 172.21.248.242/k8s/heapster-grafana-amd64:v4.4.3
docker tag anjia0532/heapster-amd64:v1.5.3 172.21.248.242/k8s/heapster-amd64:v1.5.3
docker tag anjia0532/etcd-amd64:3.1.12 172.21.248.242/k8s/etcd-amd64:3.1.12
docker tag anjia0532/fluentd-elasticsearch:v2.0.4 172.21.248.242/k8s/fluentd-elasticsearch:v2.0.4
docker tag anjia0532/cluster-autoscaler:v1.2.0 172.21.248.242/k8s/cluster-autoscaler:v1.2.0

docker push 172.21.248.242/k8s/pause-amd64:3.1
docker push 172.21.248.242/k8s/kube-apiserver-amd64:v1.10.5
docker push 172.21.248.242/k8s/kube-proxy-amd64:v1.10.5
docker push 172.21.248.242/k8s/kube-scheduler-amd64:v1.10.5
docker push 172.21.248.242/k8s/kube-controller-manager-amd64:v1.10.5
docker push 172.21.248.242/k8s/k8s-dns-sidecar-amd64:1.14.8
docker push 172.21.248.242/k8s/k8s-dns-kube-dns-amd64:1.14.8
docker push 172.21.248.242/k8s/k8s-dns-dnsmasq-nanny-amd64:1.14.8
docker push 172.21.248.242/k8s/kubernetes-dashboard-amd64:v1.8.3
docker push 172.21.248.242/k8s/heapster-influxdb-amd64:v1.3.3
docker push 172.21.248.242/k8s/heapster-grafana-amd64:v4.4.3
docker push 172.21.248.242/k8s/heapster-amd64:v1.5.3
docker push 172.21.248.242/k8s/etcd-amd64:3.1.12
docker push 172.21.248.242/k8s/fluentd-elasticsearch:v2.0.4
docker push 172.21.248.242/k8s/cluster-autoscaler:v1.2.0
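The pull/tag/push triples above are mechanical, so they can be driven from a single image list. A dry-run sketch that only prints the docker commands (remove the echo to actually run them); the list is abbreviated to three images here, and 172.21.248.242/k8s is the private harbor project from section 2.3:

```shell
registry=172.21.248.242/k8s
# "source-image target-image:tag" pairs (abbreviated from the list above).
images="
anjia0532/pause-amd64:3.1 pause-amd64:3.1
anjia0532/google-containers.kube-apiserver-amd64:v1.10.5 kube-apiserver-amd64:v1.10.5
anjia0532/etcd-amd64:3.1.12 etcd-amd64:3.1.12
"

plan=$(echo "$images" | while read -r src dst; do
  [ -n "$src" ] || continue
  echo "docker pull $src"
  echo "docker tag $src $registry/$dst"
  echo "docker push $registry/$dst"
done)
echo "$plan"
```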

5.2 Installing kubelet, kubeadm, and kubectl

Install on all nodes:

yum install -y kubelet-1.10.5-0.x86_64 kubeadm-1.10.5-0.x86_64 kubectl-1.10.5-0.x86_64

Modify the kubelet configuration on all nodes.
The kubelet configuration lives in /etc/systemd/system/kubelet.service.d/10-kubeadm.conf.
It can be overridden by creating a new drop-in, /etc/systemd/system/kubelet.service.d/20-kubeadm.conf.
By default kubeadm only supports cni networking, not kubenet, but the Tencent CCM requires kubenet, so we configure kubenet here.
Change --hostname-override=10.12.96.3 to each machine's own IP.

printf '[Service]\nEnvironment="KUBELET_NETWORK_ARGS=--network-plugin=kubenet"\nEnvironment="KUBELET_DNS_ARGS=--cluster-dns=10.254.255.10 --cluster-domain=cluster.local"\n' > /etc/systemd/system/kubelet.service.d/20-kubeadm.conf
printf 'Environment="KUBELET_EXTRA_ARGS=--v=2 --fail-swap-on=false --pod-infra-container-image=172.21.248.242/k8s/pause-amd64:3.1 --cloud-provider=external  --hostname-override=10.12.96.6"\n' >> /etc/systemd/system/kubelet.service.d/20-kubeadm.conf
printf 'Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=cgroupfs"\n' >> /etc/systemd/system/kubelet.service.d/20-kubeadm.conf


The final configuration:
[root@sh-saas-cvmk8s-master-01 ~]# cat /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_SYSTEM_PODS_ARGS=--pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true"
Environment="KUBELET_NETWORK_ARGS=--network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin"
Environment="KUBELET_DNS_ARGS=--cluster-dns=10.96.0.10 --cluster-domain=cluster.local"
Environment="KUBELET_AUTHZ_ARGS=--authorization-mode=Webhook --client-ca-file=/etc/kubernetes/pki/ca.crt"
Environment="KUBELET_CADVISOR_ARGS=--cadvisor-port=0"
Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=systemd"
Environment="KUBELET_CERTIFICATE_ARGS=--rotate-certificates=true --cert-dir=/var/lib/kubelet/pki"
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_NETWORK_ARGS $KUBELET_DNS_ARGS $KUBELET_AUTHZ_ARGS $KUBELET_CADVISOR_ARGS $KUBELET_CGROUP_ARGS $KUBELET_CERTIFICATE_ARGS $KUBELET_EXTRA_ARGS

[root@sh-saas-cvmk8s-master-01 ~]# cat /etc/systemd/system/kubelet.service.d/20-kubeadm.conf
[Service]
Environment="KUBELET_NETWORK_ARGS=--network-plugin=kubenet"
Environment="KUBELET_DNS_ARGS=--cluster-dns=10.254.255.10 --cluster-domain=cluster.local"
Environment="KUBELET_EXTRA_ARGS=--v=2 --fail-swap-on=false --pod-infra-container-image=172.21.248.242/k8s/pause-amd64:3.1 --cloud-provider=external  --hostname-override=10.12.96.2"
Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=cgroupfs"
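Since --hostname-override must carry each machine's own IP, the 20-kubeadm.conf drop-in can be generated instead of hand-edited on every node. A sketch; detecting the primary IP via `ip route get` is an assumption about the environment, and the file is written to a temporary directory here rather than /etc/systemd/system/kubelet.service.d:

```shell
# Guess this node's primary IP (assumption: the route toward the node
# subnet carries the right source address); fall back gracefully.
node_ip=$(ip route get 10.12.96.1 2>/dev/null | awk '{for (i=1;i<NF;i++) if ($i=="src") print $(i+1)}' | head -n1)
[ -n "$node_ip" ] || node_ip=$(hostname -I 2>/dev/null | awk '{print $1}')
[ -n "$node_ip" ] || node_ip=127.0.0.1   # last resort so the sketch still runs

dropin_dir=$(mktemp -d)   # stand-in for /etc/systemd/system/kubelet.service.d
cat > "$dropin_dir/20-kubeadm.conf" <<EOF
[Service]
Environment="KUBELET_NETWORK_ARGS=--network-plugin=kubenet"
Environment="KUBELET_DNS_ARGS=--cluster-dns=10.254.255.10 --cluster-domain=cluster.local"
Environment="KUBELET_EXTRA_ARGS=--v=2 --fail-swap-on=false --pod-infra-container-image=172.21.248.242/k8s/pause-amd64:3.1 --cloud-provider=external --hostname-override=$node_ip"
Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=cgroupfs"
EOF
grep -o -- "--hostname-override=[0-9.]*" "$dropin_dir/20-kubeadm.conf"
```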

Apply the configuration:

systemctl daemon-reload
systemctl enable kubelet

Enable command-line completion:

source /usr/share/bash-completion/bash_completion
source <(kubectl completion bash)
source <(kubeadm completion bash)
echo "source <(kubectl completion bash)" >> ~/.bashrc
echo "source <(kubeadm completion bash)" >> ~/.bashrc 

5.3 Configuring the Cluster with kubeadm

5.3.1 Generating the kubeadm Config File

Generate the kubeadm config file on the master nodes sh-saas-cvmk8s-master-01, sh-saas-cvmk8s-master-02, and sh-saas-cvmk8s-master-03:

cat <<EOF > config.yaml 
apiVersion: kubeadm.k8s.io/v1alpha1
kind: MasterConfiguration
cloudProvider: external
nodeName: 10.12.96.13
etcd:
  endpoints:
  - https://10.12.96.3:2379
  - https://10.12.96.5:2379
  - https://10.12.96.13:2379
  caFile: /etc/etcd/ssl/ca.pem
  certFile: /etc/etcd/ssl/etcd.pem
  keyFile: /etc/etcd/ssl/etcd-key.pem
  dataDir: /var/lib/etcd
kubernetesVersion: 1.10.5
api:
  advertiseAddress: "10.12.96.13"
  controlPlaneEndpoint: "10.12.16.101:8443"
token: "uirbod.udq3j3mhciwevjhh"
tokenTTL: "0s"
apiServerCertSANs:
- sh-saas-cvmk8s-master-01
- sh-saas-cvmk8s-master-02
- sh-saas-cvmk8s-master-03
- sh-saas-cvmk8s-master-04
- sh-saas-cvmk8s-master-05
- 10.12.96.3
- 10.12.96.5
- 10.12.96.13
- 10.12.16.101
- 10.12.96.10
- 10.12.96.11
- api.cvmk8s.dev.internal.huilog.com
apiServerExtraArgs:
  enable-swagger-ui: "true"
  insecure-bind-address: 0.0.0.0
  insecure-port: "8088"
networking:
  dnsDomain: cluster.local
  podSubnet: 10.254.0.0/16
  serviceSubnet: 10.254.255.0/24
featureGates:
  CoreDNS: true
imageRepository: "172.21.248.242/k8s"
EOF
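The nodeName and api.advertiseAddress fields presumably carry each master's own IP (the sample above shows master-03's 10.12.96.13, while the init log in 5.3.2 marks 10.12.96.3 as master), with the rest of config.yaml identical across the three masters. A sketch of stamping one master's IP into the shared template with sed; it operates on a minimal stand-in file so it can be dry-run anywhere:

```shell
self_ip=10.12.96.3                 # set to each master's own IP
workdir=$(mktemp -d)
# Minimal stand-in for the config.yaml generated above.
printf 'nodeName: 10.12.96.13\napi:\n  advertiseAddress: "10.12.96.13"\n' > "$workdir/config.yaml"

# Rewrite only the per-node fields; everything else stays shared.
sed -i -e "s/^nodeName: .*/nodeName: $self_ip/" \
       -e "s/advertiseAddress: \".*\"/advertiseAddress: \"$self_ip\"/" \
       "$workdir/config.yaml"
grep -E 'nodeName|advertiseAddress' "$workdir/config.yaml"
```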

Note:

5.3.2 Initializing the Cluster on sh-saas-cvmk8s-master-01

[root@sh-saas-cvmk8s-master-01 k8s]# kubeadm init --config config.yaml                         
[init] Using Kubernetes version: v1.10.5
[init] Using Authorization modes: [Node RBAC]
[init] WARNING: For cloudprovider integrations to work --cloud-provider must be set for all kubelets in the cluster.
        (/etc/systemd/system/kubelet.service.d/10-kubeadm.conf should be edited for this purpose)
[preflight] Running pre-flight checks.
        [WARNING SystemVerification]: docker version is greater than the most recently validated version. Docker version: 17.06.2-ce. Max validated version: 17.03
Flag --insecure-bind-address has been deprecated, This flag will be removed in a future version.
Flag --insecure-port has been deprecated, This flag will be removed in a future version.
[certificates] Generated ca certificate and key.
[certificates] Generated apiserver certificate and key.
[certificates] apiserver serving cert is signed for DNS names [10.12.96.3 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local sh-saas-cvmk8s-master-01 sh-saas-cvmk8s-master-02 sh-saas-cvmk8s-master-03 sh-saas-cvmk8s-master-04 sh-saas-cvmk8s-master-05 api.cvmk8s.dev.internal.huilog.com] and IPs [10.254.255.1 10.12.16.101 10.12.96.3 10.12.96.5 10.12.96.13 10.12.16.101 10.12.96.10 10.12.96.11]
[certificates] Generated apiserver-kubelet-client certificate and key.
[certificates] Generated sa key and public key.
[certificates] Generated front-proxy-ca certificate and key.
[certificates] Generated front-proxy-client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[controlplane] Wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] Wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] Wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[init] Waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests".
[init] This might take a minute or longer if the control plane images have to be pulled.
[apiclient] All control plane components are healthy after 12.506839 seconds
[uploadconfig] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[markmaster] Will mark node 10.12.96.3 as master by adding a label and a taint
[markmaster] Master 10.12.96.3 tainted and labelled with key/value: node-role.kubernetes.io/master=""
[bootstraptoken] Using token: uirbod.udq3j3mhciwevjhh
[bootstraptoken] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstraptoken] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of machines by running the following on each node
as root:

  kubeadm join 10.12.16.101:6443 --token uirbod.udq3j3mhciwevjhh --discovery-token-ca-cert-hash sha256:8f9113bc0f118c3f4b675a2f0e3e18f0e091044ec74d5240b3c62e261cac6cae

As the output instructs, run the following commands and kubectl will work directly:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

If anything goes wrong, you can reset to the initial state with kubeadm reset and then re-initialize:

kubeadm reset

# kubeadm reset does not wipe the data stored in etcd; we have to clear it
# manually, otherwise many problems will follow
systemctl stop etcd.service && rm -rf /var/lib/etcd/* && systemctl start etcd.service

# then re-run the init
kubeadm init --config config.yaml

If there are no errors, you should see the following:

[root@sh-saas-cvmk8s-master-01 k8s]# kubectl get node -o wide
NAME         STATUS    ROLES     AGE       VERSION   EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION              CONTAINER-RUNTIME
10.12.96.3   Ready     master    3m        v1.10.5   <none>        CentOS Linux 7 (Core)   3.10.0-862.3.3.el7.x86_64   docker://17.6.2

[root@sh-saas-cvmk8s-master-01 k8s]# kubectl get all --all-namespaces -owide
NAMESPACE     NAME                                     READY     STATUS    RESTARTS   AGE       IP        NODE
kube-system   pod/coredns-7997f8864c-gcjrz             0/1       Pending   0          2m        <none>    <none>
kube-system   pod/coredns-7997f8864c-pfcxk             0/1       Pending   0          2m        <none>    <none>
kube-system   pod/kube-apiserver-10.12.96.3            1/1       Running   0          1m        <none>    10.12.96.3
kube-system   pod/kube-controller-manager-10.12.96.3   1/1       Running   0          2m        <none>    10.12.96.3
kube-system   pod/kube-proxy-7bdn2                     1/1       Running   0          2m        <none>    10.12.96.3
kube-system   pod/kube-scheduler-10.12.96.3            1/1       Running   0          2m        <none>    10.12.96.3

NAMESPACE     NAME                 TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)         AGE       SELECTOR
default       service/kubernetes   ClusterIP   10.254.255.1    <none>        443/TCP         3m        <none>
kube-system   service/kube-dns     ClusterIP   10.254.255.10   <none>        53/UDP,53/TCP   3m        k8s-app=kube-dns

NAMESPACE     NAME                        DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE       CONTAINERS   IMAGES                                        SELECTOR
kube-system   daemonset.apps/kube-proxy   1         1         1         1            1           <none>          3m        kube-proxy   172.21.248.242/k8s/kube-proxy-amd64:v1.10.5   k8s-app=kube-proxy

NAMESPACE     NAME                      DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE       CONTAINERS   IMAGES                  SELECTOR
kube-system   deployment.apps/coredns   2         2         2            0           3m        coredns      coredns/coredns:1.0.6   k8s-app=kube-dns

NAMESPACE     NAME                                 DESIRED   CURRENT   READY     AGE       CONTAINERS   IMAGES                  SELECTOR
kube-system   replicaset.apps/coredns-7997f8864c   2         2         0         2m        coredns      coredns/coredns:1.0.6   k8s-app=kube-dns,pod-template-hash=3553944207

5.3.3 Initialize the other two master nodes

First distribute the certificates just generated on the first master to the other two master nodes; the node machines do not need them.

scp -r /etc/kubernetes/pki 10.12.96.5:/etc/kubernetes/
scp -r /etc/kubernetes/pki 10.12.96.13:/etc/kubernetes/

Initialize on the other two master nodes. Note that nodeName in the kubeadm config file must be changed to the local IP on each node, since tencentcloud-cloud-controller-manager requires node names to be IP addresses:

[root@sh-saas-cvmk8s-master-02 k8s]# kubeadm init --config config.yaml
[init] Using Kubernetes version: v1.10.5
[init] Using Authorization modes: [Node RBAC]
[init] WARNING: For cloudprovider integrations to work --cloud-provider must be set for all kubelets in the cluster.
        (/etc/systemd/system/kubelet.service.d/10-kubeadm.conf should be edited for this purpose)
[preflight] Running pre-flight checks.
        [WARNING SystemVerification]: docker version is greater than the most recently validated version. Docker version: 17.06.2-ce. Max validated version: 17.03
Flag --insecure-bind-address has been deprecated, This flag will be removed in a future version.
Flag --insecure-port has been deprecated, This flag will be removed in a future version.
[preflight] Starting the kubelet service
[certificates] Using the existing ca certificate and key.
[certificates] Using the existing apiserver certificate and key.
[certificates] Using the existing apiserver-kubelet-client certificate and key.
[certificates] Using the existing sa key.
[certificates] Using the existing front-proxy-ca certificate and key.
[certificates] Using the existing front-proxy-client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[controlplane] Wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] Wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] Wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[init] Waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests".
[init] This might take a minute or longer if the control plane images have to be pulled.
[apiclient] All control plane components are healthy after 0.019100 seconds
[uploadconfig] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[markmaster] Will mark node 10.12.96.5 as master by adding a label and a taint
[markmaster] Master 10.12.96.5 tainted and labelled with key/value: node-role.kubernetes.io/master=""
[bootstraptoken] Using token: uirbod.udq3j3mhciwevjhh
[bootstraptoken] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstraptoken] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of machines by running the following on each node
as root:

  kubeadm join 10.12.16.101:6443 --token uirbod.udq3j3mhciwevjhh --discovery-token-ca-cert-hash sha256:8f9113bc0f118c3f4b675a2f0e3e18f0e091044ec74d5240b3c62e261cac6cae

[root@sh-saas-cvmk8s-master-03 k8s]# kubeadm init --config config.yaml
[init] Using Kubernetes version: v1.10.5
[init] Using Authorization modes: [Node RBAC]
[init] WARNING: For cloudprovider integrations to work --cloud-provider must be set for all kubelets in the cluster.
        (/etc/systemd/system/kubelet.service.d/10-kubeadm.conf should be edited for this purpose)
[preflight] Running pre-flight checks.
        [WARNING SystemVerification]: docker version is greater than the most recently validated version. Docker version: 17.06.2-ce. Max validated version: 17.03
Flag --insecure-bind-address has been deprecated, This flag will be removed in a future version.
Flag --insecure-port has been deprecated, This flag will be removed in a future version.
[preflight] Starting the kubelet service
[certificates] Using the existing ca certificate and key.
[certificates] Using the existing apiserver certificate and key.
[certificates] Using the existing apiserver-kubelet-client certificate and key.
[certificates] Using the existing sa key.
[certificates] Using the existing front-proxy-ca certificate and key.
[certificates] Using the existing front-proxy-client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[controlplane] Wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] Wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] Wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[init] Waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests".
[init] This might take a minute or longer if the control plane images have to be pulled.
[apiclient] All control plane components are healthy after 0.020281 seconds
[uploadconfig] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[markmaster] Will mark node 10.12.96.13 as master by adding a label and a taint
[markmaster] Master 10.12.96.13 tainted and labelled with key/value: node-role.kubernetes.io/master=""
[bootstraptoken] Using token: uirbod.udq3j3mhciwevjhh
[bootstraptoken] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstraptoken] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of machines by running the following on each node
as root:

  kubeadm join 10.12.16.101:6443 --token uirbod.udq3j3mhciwevjhh --discovery-token-ca-cert-hash sha256:8f9113bc0f118c3f4b675a2f0e3e18f0e091044ec74d5240b3c62e261cac6cae


Notice that the generated join hash is identical on every master.
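That is expected: --discovery-token-ca-cert-hash is just the SHA-256 of the cluster CA's public key, which all three masters share. A self-contained sketch of how kubeadm derives it (a throwaway CA is generated in a temp dir so the commands can run anywhere; on a real master you would point at /etc/kubernetes/pki/ca.crt instead):

```shell
# --discovery-token-ca-cert-hash = sha256 over the DER-encoded
# public key of the cluster CA certificate.
tmp=$(mktemp -d)
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=demo-ca" \
  -keyout "$tmp/ca.key" -out "$tmp/ca.crt" 2>/dev/null
openssl x509 -pubkey -noout -in "$tmp/ca.crt" \
  | openssl pkey -pubin -outform der \
  | openssl dgst -sha256 -hex \
  | awk '{print "sha256:" $NF}'
rm -rf "$tmp"
```

Since every master copies the same pki directory, the derived hash (and therefore the printed join command) is the same everywhere.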

Checking the cluster state now shows 3 master nodes, and every control-plane component has 2 more instances than before; only coredns is unchanged, since it already had 2 replicas.

[root@sh-saas-cvmk8s-master-01 k8s]# kubectl get node -o wide                               
NAME          STATUS    ROLES     AGE       VERSION   EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION              CONTAINER-RUNTIME
10.12.96.13   Ready     master    34s       v1.10.5   <none>        CentOS Linux 7 (Core)   3.10.0-862.3.3.el7.x86_64   docker://17.6.2
10.12.96.3    Ready     master    10m       v1.10.5   <none>        CentOS Linux 7 (Core)   3.10.0-862.3.3.el7.x86_64   docker://17.6.2
10.12.96.5    Ready     master    1m        v1.10.5   <none>        CentOS Linux 7 (Core)   3.10.0-862.3.3.el7.x86_64   docker://17.6.2

[root@sh-saas-cvmk8s-master-01 k8s]# kubectl get all --all-namespaces -owide                
NAMESPACE     NAME                                      READY     STATUS    RESTARTS   AGE       IP        NODE
kube-system   pod/coredns-7997f8864c-gcjrz              0/1       Pending   0          9m        <none>    <none>
kube-system   pod/coredns-7997f8864c-pfcxk              0/1       Pending   0          9m        <none>    <none>
kube-system   pod/kube-apiserver-10.12.96.13            1/1       Running   0          38s       <none>    10.12.96.13
kube-system   pod/kube-apiserver-10.12.96.3             1/1       Running   0          8m        <none>    10.12.96.3
kube-system   pod/kube-apiserver-10.12.96.5             1/1       Running   0          1m        <none>    10.12.96.5
kube-system   pod/kube-controller-manager-10.12.96.13   1/1       Running   0          38s       <none>    10.12.96.13
kube-system   pod/kube-controller-manager-10.12.96.3    1/1       Running   0          9m        <none>    10.12.96.3
kube-system   pod/kube-controller-manager-10.12.96.5    1/1       Running   0          1m        <none>    10.12.96.5
kube-system   pod/kube-proxy-7bdn2                      1/1       Running   0          9m        <none>    10.12.96.3
kube-system   pod/kube-proxy-mxvjf                      1/1       Running   0          1m        <none>    10.12.96.5
kube-system   pod/kube-proxy-zvhm5                      1/1       Running   0          41s       <none>    10.12.96.13
kube-system   pod/kube-scheduler-10.12.96.13            1/1       Running   0          38s       <none>    10.12.96.13
kube-system   pod/kube-scheduler-10.12.96.3             1/1       Running   0          9m        <none>    10.12.96.3
kube-system   pod/kube-scheduler-10.12.96.5             1/1       Running   0          1m        <none>    10.12.96.5

NAMESPACE     NAME                 TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)         AGE       SELECTOR
default       service/kubernetes   ClusterIP   10.254.255.1    <none>        443/TCP         10m       <none>
kube-system   service/kube-dns     ClusterIP   10.254.255.10   <none>        53/UDP,53/TCP   10m       k8s-app=kube-dns

NAMESPACE     NAME                        DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE       CONTAINERS   IMAGES                                        SELECTOR
kube-system   daemonset.apps/kube-proxy   3         3         3         3            3           <none>          10m       kube-proxy   172.21.248.242/k8s/kube-proxy-amd64:v1.10.5   k8s-app=kube-proxy

NAMESPACE     NAME                      DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE       CONTAINERS   IMAGES                  SELECTOR
kube-system   deployment.apps/coredns   2         2         2            0           10m       coredns      coredns/coredns:1.0.6   k8s-app=kube-dns

NAMESPACE     NAME                                 DESIRED   CURRENT   READY     AGE       CONTAINERS   IMAGES                  SELECTOR
kube-system   replicaset.apps/coredns-7997f8864c   2         2         0         9m        coredns      coredns/coredns:1.0.6   k8s-app=kube-dns,pod-template-hash=3553944207
[root@sh-saas-cvmk8s-master-01 k8s]# 

PS: By default, masters do not schedule pods. The following command allows masters to run pods as well; this is not recommended in production.

kubectl taint nodes --all node-role.kubernetes.io/master-

5.3.4 Join the node machines to the cluster

On the 3 node machines, run the following command to join the cluster:

[root@sh-saas-cvmk8s-node-01 ~]# kubeadm join 10.12.16.101:6443 --token uirbod.udq3j3mhciwevjhh --discovery-token-ca-cert-hash sha256:8f9113bc0f118c3f4b675a2f0e3e18f0e091044ec74d5240b3c62e261cac6cae
[preflight] Running pre-flight checks.
        [WARNING SystemVerification]: docker version is greater than the most recently validated version. Docker version: 17.06.2-ce. Max validated version: 17.03
        [WARNING Hostname]: hostname "sh-saas-cvmk8s-node-01" could not be reached
        [WARNING Hostname]: hostname "sh-saas-cvmk8s-node-01" lookup sh-saas-cvmk8s-node-01 on 183.60.83.19:53: no such host
        [WARNING FileExisting-crictl]: crictl not found in system path
Suggestion: go get github.com/kubernetes-incubator/cri-tools/cmd/crictl
[preflight] Starting the kubelet service
[discovery] Trying to connect to API Server "10.12.16.101:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://10.12.16.101:6443"
[discovery] Requesting info from "https://10.12.16.101:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "10.12.16.101:6443"
[discovery] Successfully established connection with API Server "10.12.16.101:6443"

This node has joined the cluster:
* Certificate signing request was sent to master and a response
  was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the master to see this node join the cluster.

[root@sh-saas-cvmk8s-node-02 ~]# kubeadm join 10.12.16.101:6443 --token uirbod.udq3j3mhciwevjhh --discovery-token-ca-cert-hash sha256:8f9113bc0f118c3f4b675a2f0e3e18f0e091044ec74d5240b3c62e261cac6cae
[preflight] Running pre-flight checks.
        [WARNING SystemVerification]: docker version is greater than the most recently validated version. Docker version: 17.06.2-ce. Max validated version: 17.03
        [WARNING Hostname]: hostname "sh-saas-cvmk8s-node-02" could not be reached
        [WARNING Hostname]: hostname "sh-saas-cvmk8s-node-02" lookup sh-saas-cvmk8s-node-02 on 183.60.83.19:53: no such host
        [WARNING FileExisting-crictl]: crictl not found in system path
Suggestion: go get github.com/kubernetes-incubator/cri-tools/cmd/crictl
[preflight] Starting the kubelet service
[discovery] Trying to connect to API Server "10.12.16.101:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://10.12.16.101:6443"
[discovery] Requesting info from "https://10.12.16.101:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "10.12.16.101:6443"
[discovery] Successfully established connection with API Server "10.12.16.101:6443"

This node has joined the cluster:
* Certificate signing request was sent to master and a response
  was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the master to see this node join the cluster.

[root@sh-saas-cvmk8s-node-03 ~]# kubeadm join 10.12.16.101:6443 --token uirbod.udq3j3mhciwevjhh --discovery-token-ca-cert-hash sha256:8f9113bc0f118c3f4b675a2f0e3e18f0e091044ec74d5240b3c62e261cac6cae
[preflight] Running pre-flight checks.
        [WARNING SystemVerification]: docker version is greater than the most recently validated version. Docker version: 17.06.2-ce. Max validated version: 17.03
        [WARNING Hostname]: hostname "sh-saas-cvmk8s-node-03" could not be reached
        [WARNING Hostname]: hostname "sh-saas-cvmk8s-node-03" lookup sh-saas-cvmk8s-node-03 on 183.60.83.19:53: no such host
        [WARNING FileExisting-crictl]: crictl not found in system path
Suggestion: go get github.com/kubernetes-incubator/cri-tools/cmd/crictl
[preflight] Starting the kubelet service
[discovery] Trying to connect to API Server "10.12.16.101:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://10.12.16.101:6443"
[discovery] Requesting info from "https://10.12.16.101:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "10.12.16.101:6443"
[discovery] Successfully established connection with API Server "10.12.16.101:6443"

This node has joined the cluster:
* Certificate signing request was sent to master and a response
  was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the master to see this node join the cluster.

# all 3 node machines have now joined
[root@sh-saas-cvmk8s-master-01 k8s]# kubectl get node -o wide  
NAME          STATUS    ROLES     AGE       VERSION   EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION              CONTAINER-RUNTIME
10.12.96.13   Ready     master    17m       v1.10.5   <none>        CentOS Linux 7 (Core)   3.10.0-862.3.3.el7.x86_64   docker://17.6.2
10.12.96.2    Ready     <none>    2m        v1.10.5   <none>        CentOS Linux 7 (Core)   3.10.0-862.3.3.el7.x86_64   docker://17.6.2
10.12.96.3    Ready     master    26m       v1.10.5   <none>        CentOS Linux 7 (Core)   3.10.0-862.3.3.el7.x86_64   docker://17.6.2
10.12.96.4    Ready     <none>    43s       v1.10.5   <none>        CentOS Linux 7 (Core)   3.10.0-862.3.3.el7.x86_64   docker://17.6.2
10.12.96.5    Ready     master    18m       v1.10.5   <none>        CentOS Linux 7 (Core)   3.10.0-862.3.3.el7.x86_64   docker://17.6.2
10.12.96.6    Ready     <none>    39s       v1.10.5   <none>        CentOS Linux 7 (Core)   3.10.0-862.3.3.el7.x86_64   docker://17.6.2

[root@sh-saas-cvmk8s-master-01 k8s]# kubectl get all --all-namespaces -owide
NAMESPACE     NAME                                      READY     STATUS    RESTARTS   AGE       IP        NODE
kube-system   pod/coredns-7997f8864c-gcjrz              0/1       Pending   0          1h        <none>    <none>
kube-system   pod/coredns-7997f8864c-pfcxk              0/1       Pending   0          1h        <none>    <none>
kube-system   pod/kube-apiserver-10.12.96.13            1/1       Running   0          1h        <none>    10.12.96.13
kube-system   pod/kube-apiserver-10.12.96.3             1/1       Running   0          1h        <none>    10.12.96.3
kube-system   pod/kube-apiserver-10.12.96.5             1/1       Running   0          1h        <none>    10.12.96.5
kube-system   pod/kube-controller-manager-10.12.96.13   1/1       Running   0          1h        <none>    10.12.96.13
kube-system   pod/kube-controller-manager-10.12.96.3    1/1       Running   0          1h        <none>    10.12.96.3
kube-system   pod/kube-controller-manager-10.12.96.5    1/1       Running   0          1h        <none>    10.12.96.5
kube-system   pod/kube-proxy-7bdn2                      1/1       Running   0          1h        <none>    10.12.96.3
kube-system   pod/kube-proxy-7tz77                      1/1       Running   0          50m       <none>    10.12.96.2
kube-system   pod/kube-proxy-fcq6t                      1/1       Running   0          49m       <none>    10.12.96.6
kube-system   pod/kube-proxy-fxstm                      1/1       Running   0          49m       <none>    10.12.96.4
kube-system   pod/kube-proxy-mxvjf                      1/1       Running   0          1h        <none>    10.12.96.5
kube-system   pod/kube-proxy-zvhm5                      1/1       Running   0          1h        <none>    10.12.96.13
kube-system   pod/kube-scheduler-10.12.96.13            1/1       Running   0          1h        <none>    10.12.96.13
kube-system   pod/kube-scheduler-10.12.96.3             1/1       Running   0          1h        <none>    10.12.96.3
kube-system   pod/kube-scheduler-10.12.96.5             1/1       Running   0          1h        <none>    10.12.96.5

NAMESPACE     NAME                 TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)         AGE       SELECTOR
default       service/kubernetes   ClusterIP   10.254.255.1    <none>        443/TCP         1h        <none>
kube-system   service/kube-dns     ClusterIP   10.254.255.10   <none>        53/UDP,53/TCP   1h        k8s-app=kube-dns

NAMESPACE     NAME                        DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE       CONTAINERS   IMAGES                                        SELECTOR
kube-system   daemonset.apps/kube-proxy   6         6         6         6            6           <none>          1h        kube-proxy   172.21.248.242/k8s/kube-proxy-amd64:v1.10.5   k8s-app=kube-proxy

NAMESPACE     NAME                      DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE       CONTAINERS   IMAGES                  SELECTOR
kube-system   deployment.apps/coredns   2         2         2            0           1h        coredns      coredns/coredns:1.0.6   k8s-app=kube-dns

NAMESPACE     NAME                                 DESIRED   CURRENT   READY     AGE       CONTAINERS   IMAGES                  SELECTOR
kube-system   replicaset.apps/coredns-7997f8864c   2         2         0         1h        coredns      coredns/coredns:1.0.6   k8s-app=kube-dns,pod-template-hash=3553944207

The 3 node machines have joined the cluster.

However, all pod IPs show as none, and coredns is still Pending. This is because tencentcloud-cloud-controller-manager has not been deployed yet, so the pod network is not up.

5.4 Deploy tencentcloud-cloud-controller-manager

5.4.1 Build and install tencentcloud-cloud-controller-manager

This only needs to be done on one machine:

yum install -y git go

mkdir -p /root/go/src/github.com/dbdd4us/
git clone https://github.com/dbdd4us/tencentcloud-cloud-controller-manager.git /root/go/src/github.com/dbdd4us/tencentcloud-cloud-controller-manager

# build tencentcloud-cloud-controller-manager:
cd /root/go/src/github.com/dbdd4us/tencentcloud-cloud-controller-manager
go build -v

# build route-ctl:
cd /root/go/src/github.com/dbdd4us/tencentcloud-cloud-controller-manager/route-ctl
go build -v

# install route-ctl
cp /root/go/src/github.com/dbdd4us/tencentcloud-cloud-controller-manager/route-ctl/route-ctl /usr/local/bin/

5.4.2 Create the route table

Reference: https://github.com/dbdd4us/tencentcloud-cloud-controller-manager/blob/master/route-ctl/README.md
The 10.254.0.0/16 pod CIDR must not conflict with any existing CIDR block in the VPC (subnets or other cluster route tables); see the route-ctl README above for details.
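A rough way to sanity-check two IPv4 CIDRs for overlap before creating the route table (a shell sketch; the subnet values below are examples from this environment, and real capacity planning should go through the Tencent Cloud VPC console):

```shell
# Rough IPv4 CIDR overlap check: two CIDRs overlap iff they agree on
# the first min(len1, len2) bits of the network address.
ip2int() {
  IFS=. read -r a b c d <<EOF
$1
EOF
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}
overlap() {
  n1=${1%/*} l1=${1#*/} n2=${2%/*} l2=${2#*/}
  m=$(( l1 < l2 ? l1 : l2 ))
  mask=$(( (0xFFFFFFFF << (32 - m)) & 0xFFFFFFFF ))
  [ $(( $(ip2int "$n1") & mask )) -eq $(( $(ip2int "$n2") & mask )) ]
}
overlap 10.254.0.0/16 10.12.96.0/24 && echo conflict || echo ok   # -> ok
overlap 10.254.0.0/16 10.254.8.0/21 && echo conflict || echo ok   # -> conflict
```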

Set the environment variables route-ctl requires, then create the route table:

export QCloudSecretId=xxxxxxxxxx
export QCloudSecretKey=xxxxxxxxxx
export QCloudCcsAPIRegion=ap-shanghai

[root@sh-saas-cvmk8s-master-01 k8s]# route-ctl create --route-table-cidr-block 10.254.0.0/16 --route-table-name route-table-cvmk8s --vpc-id vpc-n0fkh7v2
INFO[0000] GET https://ccs.api.qcloud.com/v2/index.php?Action=CreateClusterRouteTable&IgnoreClusterCidrConflict=0&Nonce=5577006791947779410&Region=ap-shanghai&RouteTableCidrBlock=10.254.0.0%2F16&RouteTableName=route-table-cvmk8s&SecretId=xxxxxxxxxx&Signature=REHgrQ2c7C81gQZ%2BW3xJ6MlBX2VhWEZE20b%2BZ9ZagnU%3D&SignatureMethod=HmacSHA256&Timestamp=1530264382&VpcId=vpc-n0fkh7v2 200 {"code":0,"message":"","codeDesc":"Success","data":null}  Action=CreateClusterRouteTable
[root@sh-saas-cvmk8s-master-01 k8s]#
[root@sh-saas-cvmk8s-master-01 k8s]# route-ctl list
RouteTableName            RouteTableCidrBlock        VpcId               
route-table-cvmk8s        10.254.0.0/16              vpc-n0fkh7v2        

5.4.3 Create the tencentcloud-cloud-controller-manager config Secret

[root@sh-saas-cvmk8s-master-01 k8s]# echo -n "ap-shanghai" | base64
YXAtc2hhbmdoYWk=
[root@sh-saas-cvmk8s-master-01 k8s]# echo -n "xxxxxxxxxx" | base64
xxxxxxxxxx
[root@sh-saas-cvmk8s-master-01 k8s]# echo -n "xxxxxxxxxx" | base64
xxxxxxxxxx
[root@sh-saas-cvmk8s-master-01 k8s]# echo -n "route-table-cvmk8s" | base64
cm91dGUtdGFibGUtY3Ztazhz
[root@sh-saas-cvmk8s-master-01 k8s]#
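A common pitfall with these commands: without `-n`, echo appends a trailing newline that gets encoded into the secret value and silently breaks it. A quick check:

```shell
# echo adds a trailing newline that ends up inside the secret value;
# always encode with echo -n or printf '%s'.
printf '%s' "ap-shanghai" | base64          # -> YXAtc2hhbmdoYWk=
echo "ap-shanghai" | base64                 # -> YXAtc2hhbmdoYWkK  (the \n is encoded too)
printf '%s' "YXAtc2hhbmdoYWk=" | base64 -d  # -> ap-shanghai
```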


cat <<EOF > tencent-cloud-controller-manager-configmap.yaml 
apiVersion: v1
kind: Secret
metadata:
  name: tencentcloud-cloud-controller-manager-config
  namespace: kube-system
data:
  # note that each secret value must be base64 encoded:
  #   echo -n "<REGION>" | base64
  TENCENTCLOUD_CLOUD_CONTROLLER_MANAGER_REGION: "YXAtc2hhbmdoYWk="
  TENCENTCLOUD_CLOUD_CONTROLLER_MANAGER_SECRET_ID: "xxxxxxxxxx"
  TENCENTCLOUD_CLOUD_CONTROLLER_MANAGER_SECRET_KEY: "xxxxxxxxxx=" 
  TENCENTCLOUD_CLOUD_CONTROLLER_MANAGER_CLUSTER_ROUTE_TABLE: "cm91dGUtdGFibGUtY3Ztazhz" 
EOF
kubectl create -f  tencent-cloud-controller-manager-configmap.yaml 

5.4.4 Create the tencentcloud-cloud-controller-manager Deployment

tencentcloud-cloud-controller-manager can talk to the k8s master API server over either http or https.

For production, the https mode is recommended: it is used via a configured serviceAccount and is more secure, and the API server's insecure port should be closed at the same time.
The insecure port can be closed by changing the following two parameters in the kubeadm config file:

  insecure-bind-address: 0.0.0.0
  insecure-port: "0"
5.4.4.1 Deploy tencentcloud-cloud-controller-manager over http

cat <<EOF > tencent-cloud-controller-manager.yaml 
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: tencentcloud-cloud-controller-manager
  namespace: kube-system
spec:
  replicas: 1
  revisionHistoryLimit: 2
  template:
    metadata:
      labels:
        app: tencentcloud-cloud-controller-manager
    spec:
      dnsPolicy: Default
      tolerations:
        - key: "node.cloudprovider.kubernetes.io/uninitialized"
          value: "true"
          effect: "NoSchedule"
        - key: "node.kubernetes.io/network-unavailable"
          value: "true"
          effect: "NoSchedule"
      containers:
      - image: ccr.ccs.tencentyun.com/library/tencentcloud-cloud-controller-manager:latest
        name: tencentcloud-cloud-controller-manager
        command:
          - /bin/tencentcloud-cloud-controller-manager
          - --cloud-provider=tencentcloud # set the cloud provider to tencentcloud
          - --allocate-node-cidrs=true # have the cloud provider allocate pod CIDRs to nodes
          - --cluster-cidr=10.254.0.0/16 # the cluster pod network; its route table must be created in advance
          - --master=http://10.12.16.101:8088/ # the master's non-https API address
          - --cluster-name=kubernetes # cluster name
          - --configure-cloud-routes=true
          - --allow-untagged-cloud=true
        env:
          - name: TENCENTCLOUD_CLOUD_CONTROLLER_MANAGER_REGION
            valueFrom:
              secretKeyRef:
                name: tencentcloud-cloud-controller-manager-config
                key: TENCENTCLOUD_CLOUD_CONTROLLER_MANAGER_REGION
          - name: TENCENTCLOUD_CLOUD_CONTROLLER_MANAGER_SECRET_ID
            valueFrom:
              secretKeyRef:
                name: tencentcloud-cloud-controller-manager-config
                key: TENCENTCLOUD_CLOUD_CONTROLLER_MANAGER_SECRET_ID
          - name: TENCENTCLOUD_CLOUD_CONTROLLER_MANAGER_SECRET_KEY
            valueFrom:
              secretKeyRef:
                name: tencentcloud-cloud-controller-manager-config
                key: TENCENTCLOUD_CLOUD_CONTROLLER_MANAGER_SECRET_KEY
          - name: TENCENTCLOUD_CLOUD_CONTROLLER_MANAGER_CLUSTER_ROUTE_TABLE
            valueFrom:
              secretKeyRef:
                name: tencentcloud-cloud-controller-manager-config
                key: TENCENTCLOUD_CLOUD_CONTROLLER_MANAGER_CLUSTER_ROUTE_TABLE
EOF

kubectl create -f  tencent-cloud-controller-manager.yaml 
5.4.4.2 Deploy tencentcloud-cloud-controller-manager over https

# create the ServiceAccount
cat <<EOF > cloud-controller-manager.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cloud-controller-manager
  namespace: kube-system
EOF

# create the ClusterRoles
cat <<EOF > cloud-controller-manager-roles.yaml
apiVersion: v1
items:
- apiVersion: rbac.authorization.k8s.io/v1
  kind: ClusterRole
  metadata:
    name: system:cloud-controller-manager
  rules:
  - apiGroups:
    - ""
    resources:
    - events
    verbs:
    - create
    - patch
    - update
  - apiGroups:
    - ""
    resources:
    - nodes
    verbs:
    - '*'
  - apiGroups:
    - ""
    resources:
    - nodes/status
    verbs:
    - patch
  - apiGroups:
    - ""
    resources:
    - services
    verbs:
    - list
    - patch
    - update
    - watch
  - apiGroups:
    - ""
    resources:
    - serviceaccounts
    verbs:
    - create
  - apiGroups:
    - ""
    resources:
    - persistentvolumes
    verbs:
    - get
    - list
    - update
    - watch
  - apiGroups:
    - ""
    resources:
    - endpoints
    verbs:
    - create
    - get
    - list
    - watch
    - update
- apiVersion: rbac.authorization.k8s.io/v1
  kind: ClusterRole
  metadata:
    name: system:cloud-node-controller
  rules:
  - apiGroups:
    - ""
    resources:
    - nodes
    verbs:
    - delete
    - get
    - patch
    - update
    - list
  - apiGroups:
    - ""
    resources:
    - nodes/status
    verbs:
    - patch
  - apiGroups:
    - ""
    resources:
    - events
    verbs:
    - create
    - patch
    - update
- apiVersion: rbac.authorization.k8s.io/v1
  kind: ClusterRole
  metadata:
    name: system:pvl-controller
  rules:
  - apiGroups:
    - ""
    resources:
    - persistentvolumes
    verbs:
    - get
    - list
    - watch
  - apiGroups:
    - ""
    resources:
    - events
    verbs:
    - create
    - patch
    - update
kind: List
metadata: {}
EOF

# Create the ClusterRoleBindings
cat <<EOF > cloud-controller-manager-role-bindings.yaml
apiVersion: v1
items:
- apiVersion: rbac.authorization.k8s.io/v1
  kind: ClusterRoleBinding
  metadata:
    name: system:cloud-node-controller
  roleRef:
    apiGroup: rbac.authorization.k8s.io
    kind: ClusterRole
    name: system:cloud-node-controller
  subjects:
  - kind: ServiceAccount
    name: cloud-node-controller
    namespace: kube-system
- apiVersion: rbac.authorization.k8s.io/v1
  kind: ClusterRoleBinding
  metadata:
    name: system:pvl-controller
  roleRef:
    apiGroup: rbac.authorization.k8s.io
    kind: ClusterRole
    name: system:pvl-controller
  subjects:
  - kind: ServiceAccount
    name: pvl-controller
    namespace: kube-system
- apiVersion: rbac.authorization.k8s.io/v1
  kind: ClusterRoleBinding
  metadata:
    name: system:cloud-controller-manager
  roleRef:
    apiGroup: rbac.authorization.k8s.io
    kind: ClusterRole
    name: cluster-admin
  subjects:
  - kind: ServiceAccount
    name: cloud-controller-manager
    namespace: kube-system
kind: List
metadata: {}
EOF

cat <<EOF > tencent-cloud-controller-manager-https.yaml 
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: tencentcloud-cloud-controller-manager
  namespace: kube-system
spec:
  replicas: 1
  revisionHistoryLimit: 2
  template:
    metadata:
      labels:
        app: tencentcloud-cloud-controller-manager
    spec:
      serviceAccountName: cloud-controller-manager
      dnsPolicy: Default
      tolerations:
        - key: "node.cloudprovider.kubernetes.io/uninitialized"
          value: "true"
          effect: "NoSchedule"
        - key: "node.kubernetes.io/network-unavailable"
          value: "true"
          effect: "NoSchedule"
      containers:
      - image: ccr.ccs.tencentyun.com/library/tencentcloud-cloud-controller-manager:latest
        name: tencentcloud-cloud-controller-manager
        command:
          - /bin/tencentcloud-cloud-controller-manager
          - --cloud-provider=tencentcloud # use the tencentcloud cloud provider
          - --allocate-node-cidrs=true # have the cloud provider allocate a CIDR to each node
          - --cluster-cidr=10.254.0.0/16 # the cluster pod network, must be created in advance
          - --cluster-name=kubernetes # cluster name
          - --use-service-account-credentials
          - --configure-cloud-routes=true
          - --allow-untagged-cloud=true
        env:
          - name: TENCENTCLOUD_CLOUD_CONTROLLER_MANAGER_REGION
            valueFrom:
              secretKeyRef:
                name: tencentcloud-cloud-controller-manager-config
                key: TENCENTCLOUD_CLOUD_CONTROLLER_MANAGER_REGION
          - name: TENCENTCLOUD_CLOUD_CONTROLLER_MANAGER_SECRET_ID
            valueFrom:
              secretKeyRef:
                name: tencentcloud-cloud-controller-manager-config
                key: TENCENTCLOUD_CLOUD_CONTROLLER_MANAGER_SECRET_ID
          - name: TENCENTCLOUD_CLOUD_CONTROLLER_MANAGER_SECRET_KEY
            valueFrom:
              secretKeyRef:
                name: tencentcloud-cloud-controller-manager-config
                key: TENCENTCLOUD_CLOUD_CONTROLLER_MANAGER_SECRET_KEY
          - name: TENCENTCLOUD_CLOUD_CONTROLLER_MANAGER_CLUSTER_ROUTE_TABLE
            valueFrom:
              secretKeyRef:
                name: tencentcloud-cloud-controller-manager-config
                key: TENCENTCLOUD_CLOUD_CONTROLLER_MANAGER_CLUSTER_ROUTE_TABLE
EOF


kubectl create -f cloud-controller-manager.yaml
kubectl create -f cloud-controller-manager-roles.yaml
kubectl create -f cloud-controller-manager-role-bindings.yaml
kubectl create -f tencent-cloud-controller-manager-https.yaml

After a minute or two, the coredns pods reach Running status and the containers have all been assigned IPs.

[root@sh-saas-cvmk8s-master-01 k8s]# kubectl get all --all-namespaces -owide
NAMESPACE     NAME                                                         READY     STATUS    RESTARTS   AGE       IP            NODE
kube-system   pod/coredns-7997f8864c-gcjrz                                 1/1       Running   0          1h        10.254.2.2    10.12.96.13
kube-system   pod/coredns-7997f8864c-pfcxk                                 1/1       Running   0          1h        10.254.4.2    10.12.96.4
kube-system   pod/kube-apiserver-10.12.96.13                               1/1       Running   0          1h        10.12.96.13   10.12.96.13
kube-system   pod/kube-apiserver-10.12.96.3                                1/1       Running   0          1h        10.12.96.3    10.12.96.3
kube-system   pod/kube-apiserver-10.12.96.5                                1/1       Running   0          1h        10.12.96.5    10.12.96.5
kube-system   pod/kube-controller-manager-10.12.96.13                      1/1       Running   0          1h        10.12.96.13   10.12.96.13
kube-system   pod/kube-controller-manager-10.12.96.3                       1/1       Running   0          1h        10.12.96.3    10.12.96.3
kube-system   pod/kube-controller-manager-10.12.96.5                       1/1       Running   0          1h        10.12.96.5    10.12.96.5
kube-system   pod/kube-proxy-7bdn2                                         1/1       Running   0          1h        10.12.96.3    10.12.96.3
kube-system   pod/kube-proxy-7tz77                                         1/1       Running   0          1h        10.12.96.2    10.12.96.2
kube-system   pod/kube-proxy-fcq6t                                         1/1       Running   0          1h        10.12.96.6    10.12.96.6
kube-system   pod/kube-proxy-fxstm                                         1/1       Running   0          1h        10.12.96.4    10.12.96.4
kube-system   pod/kube-proxy-mxvjf                                         1/1       Running   0          1h        10.12.96.5    10.12.96.5
kube-system   pod/kube-proxy-zvhm5                                         1/1       Running   0          1h        10.12.96.13   10.12.96.13
kube-system   pod/kube-scheduler-10.12.96.13                               1/1       Running   0          1h        10.12.96.13   10.12.96.13
kube-system   pod/kube-scheduler-10.12.96.3                                1/1       Running   0          1h        10.12.96.3    10.12.96.3
kube-system   pod/kube-scheduler-10.12.96.5                                1/1       Running   0          1h        10.12.96.5    10.12.96.5
kube-system   pod/tencentcloud-cloud-controller-manager-5bd497dc5b-8qwt9   1/1       Running   0          3m        10.254.5.5    10.12.96.6

NAMESPACE     NAME                 TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)         AGE       SELECTOR
default       service/kubernetes   ClusterIP   10.254.255.1    <none>        443/TCP         1h        <none>
kube-system   service/kube-dns     ClusterIP   10.254.255.10   <none>        53/UDP,53/TCP   1h        k8s-app=kube-dns

NAMESPACE     NAME                        DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE       CONTAINERS   IMAGES                                        SELECTOR
kube-system   daemonset.apps/kube-proxy   6         6         6         6            6           <none>          1h        kube-proxy   172.21.248.242/k8s/kube-proxy-amd64:v1.10.5   k8s-app=kube-proxy

NAMESPACE     NAME                                                    DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE       CONTAINERS                              IMAGES                                                                        SELECTOR
kube-system   deployment.apps/coredns                                 2         2         2            2           1h        coredns                                 coredns/coredns:1.0.6                                                         k8s-app=kube-dns
kube-system   deployment.apps/tencentcloud-cloud-controller-manager   1         1         1            1           3m        tencentcloud-cloud-controller-manager   ccr.ccs.tencentyun.com/library/tencentcloud-cloud-controller-manager:latest   app=tencentcloud-cloud-controller-manager

NAMESPACE     NAME                                                               DESIRED   CURRENT   READY     AGE       CONTAINERS                              IMAGES                                                                        SELECTOR
kube-system   replicaset.apps/coredns-7997f8864c                                 2         2         2         1h        coredns                                 coredns/coredns:1.0.6                                                         k8s-app=kube-dns,pod-template-hash=3553944207
kube-system   replicaset.apps/tencentcloud-cloud-controller-manager-5bd497dc5b   1         1         1         3m        tencentcloud-cloud-controller-manager   ccr.ccs.tencentyun.com/library/tencentcloud-cloud-controller-manager:latest   app=tencentcloud-cloud-controller-manager,pod-template-hash=1680538716

5.4.5 Testing communication from inside a container

Use the command below to get a shell inside a container, then ping container IPs on other nodes, public IPs, and the IPs of test machines outside the cluster. All of them are reachable.

kubectl exec tencentcloud-cloud-controller-manager-5bd497dc5b-8qwt9  -n=kube-system -it -- sh
/ # ping 10.254.2.2 
PING 10.254.2.2 (10.254.2.2): 56 data bytes
64 bytes from 10.254.2.2: seq=0 ttl=62 time=0.318 ms
64 bytes from 10.254.2.2: seq=1 ttl=62 time=0.312 ms
^C
--- 10.254.2.2 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.312/0.315/0.318 ms
/ # ping www.163.com
PING www.163.com (59.81.64.174): 56 data bytes
64 bytes from 59.81.64.174: seq=0 ttl=52 time=5.433 ms
64 bytes from 59.81.64.174: seq=1 ttl=52 time=5.398 ms
^C
--- www.163.com ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 5.398/5.415/5.433 ms
/ # ping 172.21.251.111
PING 172.21.251.111 (172.21.251.111): 56 data bytes
64 bytes from 172.21.251.111: seq=0 ttl=60 time=2.379 ms
64 bytes from 172.21.251.111: seq=1 ttl=60 time=2.229 ms
64 bytes from 172.21.251.111: seq=2 ttl=60 time=2.211 ms
^C
--- 172.21.251.111 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 2.211/2.273/2.379 ms
/ # 

6. Installing and configuring other add-ons such as kubernetes-dashboard

6.1 kubernetes-dashboard

The dashboard version tested with kubernetes 1.10 is v1.8.3.

[root@sh-saas-cvmk8s-master-01 k8s]# wget https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml
[root@sh-saas-cvmk8s-master-01 k8s]# kubectl create -f kubernetes-dashboard.yaml
secret "kubernetes-dashboard-certs" created
serviceaccount "kubernetes-dashboard" created
role.rbac.authorization.k8s.io "kubernetes-dashboard-minimal" created
rolebinding.rbac.authorization.k8s.io "kubernetes-dashboard-minimal" created
deployment.apps "kubernetes-dashboard" created
service "kubernetes-dashboard" created

The dashboard is accessed at https://10.12.16.101:6443/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/, but opening it immediately returns an error:

{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {

  },
  "status": "Failure",
  "message": "services \"https:kubernetes-dashboard:\" is forbidden: User \"system:anonymous\" cannot get services/proxy in the namespace \"kube-system\"",
  "reason": "Forbidden",
  "details": {
    "name": "https:kubernetes-dashboard:",
    "kind": "services"
  },
  "code": 403
}

Generate a client certificate with the following commands and import it into your browser.

[root@sh-saas-cvmk8s-master-01 k8s]# cd /etc/kubernetes/pki/
[root@sh-saas-cvmk8s-master-01 pki]# ls
apiserver.crt  apiserver-kubelet-client.crt  ca.crt  front-proxy-ca.crt  front-proxy-client.crt  sa.key
apiserver.key  apiserver-kubelet-client.key  ca.key  front-proxy-ca.key  front-proxy-client.key  sa.pub
[root@sh-saas-cvmk8s-master-01 pki]# openssl pkcs12 -export -in apiserver-kubelet-client.crt  -out apiserver-kubelet-client.p12 -inkey apiserver-kubelet-client.key
Enter Export Password:
Verifying - Enter Export Password:
[root@sh-saas-cvmk8s-master-01 pki]# ls
apiserver.crt  apiserver-kubelet-client.crt  apiserver-kubelet-client.p12  ca.key              front-proxy-ca.key      front-proxy-client.key  sa.pub
apiserver.key  apiserver-kubelet-client.key  ca.crt                        front-proxy-ca.crt  front-proxy-client.crt  sa.key
[root@sh-saas-cvmk8s-master-01 pki]# sz apiserver-kubelet-client.p12 

This time the page finally loads:

Next we need to create a user, then look up that user's token to log in:

# Create a ServiceAccount
cat <<EOF | kubectl create -f -
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kube-system
EOF

# Then create a ClusterRoleBinding to grant admin-user the appropriate permissions
cat <<EOF | kubectl create -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kube-system
EOF

# Look up the admin-user token
kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin-user | awk '{print $1}')
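Note that `kubectl describe secret` prints the token already decoded. If you instead fetch it with `kubectl get secret ... -o jsonpath='{.data.token}'`, Secret `.data` fields come back base64-encoded and must be decoded before use. A minimal Python sketch (the token value here is made up for illustration):

```python
import base64

# Hypothetical base64 value, as returned by e.g.:
#   kubectl -n kube-system get secret admin-user-token-xxxxx -o jsonpath='{.data.token}'
# (all Secret .data fields are stored base64-encoded)
encoded = base64.b64encode(b"eyJhbGciOiJSUzI1NiJ9.fake-payload.fake-signature").decode()

# Decode it before pasting the value into the dashboard login form
token = base64.b64decode(encoded).decode()
print(token)
```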

Log in with the admin-user token, as shown:

6.2 Installing and configuring heapster

wget https://github.com/kubernetes/heapster/archive/v1.5.3.tar.gz
tar xvzf v1.5.3.tar.gz
cd heapster-1.5.3/

The deployment manifests live in heapster-1.5.3/deploy/kube-config/influxdb.
Change the image in the manifests to the local registry.
heapster ships a kube.sh script that applies the manifests; remember to also apply rbac/heapster-rbac.yaml, otherwise heapster will lack the required permissions:

[root@sh-saas-cvmk8s-master-01 deploy]# cd heapster-1.5.3/deploy/ 
[root@sh-saas-cvmk8s-master-01 deploy]# ls
docker  kube-config  kube.sh
[root@sh-saas-cvmk8s-master-01 deploy]# sh kube.sh 
Usage: kube.sh {start|stop|restart}
[root@sh-saas-cvmk8s-master-01 deploy]# sh kube.sh start
heapster pods have been setup
[root@sh-saas-cvmk8s-master-01 deploy]# cd kube-config
[root@sh-saas-cvmk8s-master-01 kube-config]# kubectl create -f rbac/heapster-rbac.yaml
clusterrolebinding.rbac.authorization.k8s.io "heapster" created
[root@sh-saas-cvmk8s-master-01 deploy]# kubectl get all -n=kube-system 
NAME                                                         READY     STATUS    RESTARTS   AGE
pod/coredns-7997f8864c-gcjrz                                 1/1       Running   0          23h
pod/coredns-7997f8864c-pfcxk                                 1/1       Running   0          23h
pod/heapster-65c94cfd48-xg77q                                1/1       Running   0          6s
pod/kube-apiserver-10.12.96.13                               1/1       Running   0          23h
pod/kube-apiserver-10.12.96.3                                1/1       Running   0          23h
pod/kube-apiserver-10.12.96.5                                1/1       Running   0          23h
pod/kube-controller-manager-10.12.96.13                      1/1       Running   0          23h
pod/kube-controller-manager-10.12.96.3                       1/1       Running   0          23h
pod/kube-controller-manager-10.12.96.5                       1/1       Running   0          23h
pod/kube-proxy-7bdn2                                         1/1       Running   0          23h
pod/kube-proxy-7tz77                                         1/1       Running   0          23h
pod/kube-proxy-fcq6t                                         1/1       Running   0          23h
pod/kube-proxy-fxstm                                         1/1       Running   0          23h
pod/kube-proxy-mxvjf                                         1/1       Running   0          23h
pod/kube-proxy-zvhm5                                         1/1       Running   0          23h
pod/kube-scheduler-10.12.96.13                               1/1       Running   0          23h
pod/kube-scheduler-10.12.96.3                                1/1       Running   0          23h
pod/kube-scheduler-10.12.96.5                                1/1       Running   0          23h
pod/kubernetes-dashboard-85d6464777-2xv4w                    1/1       Running   0          2h
pod/monitoring-grafana-5c5b5656d4-2rg26                      1/1       Running   0          6s
pod/monitoring-influxdb-5f7597c95d-6lzpk                     1/1       Running   0          6s
pod/tencentcloud-cloud-controller-manager-5674868749-fn5p9   1/1       Running   0          51m

NAME                           TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)         AGE
service/heapster               ClusterIP   10.254.255.236   <none>        80/TCP          6s
service/kube-dns               ClusterIP   10.254.255.10    <none>        53/UDP,53/TCP   23h
service/kubernetes-dashboard   ClusterIP   10.254.255.206   <none>        443/TCP         2h
service/monitoring-grafana     ClusterIP   10.254.255.110   <none>        80/TCP          14m
service/monitoring-influxdb    ClusterIP   10.254.255.24    <none>        8086/TCP        6s

NAME                        DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/kube-proxy   6         6         6         6            6           <none>          23h

NAME                                                    DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/coredns                                 2         2         2            2           23h
deployment.apps/heapster                                1         1         1            1           6s
deployment.apps/kubernetes-dashboard                    1         1         1            1           2h
deployment.apps/monitoring-grafana                      1         1         1            1           6s
deployment.apps/monitoring-influxdb                     1         1         1            1           6s
deployment.apps/tencentcloud-cloud-controller-manager   1         1         1            1           51m

NAME                                                               DESIRED   CURRENT   READY     AGE
replicaset.apps/coredns-7997f8864c                                 2         2         2         23h
replicaset.apps/heapster-65c94cfd48                                1         1         1         6s
replicaset.apps/kubernetes-dashboard-85d6464777                    1         1         1         2h
replicaset.apps/monitoring-grafana-5c5b5656d4                      1         1         1         6s
replicaset.apps/monitoring-influxdb-5f7597c95d                     1         1         1         6s
replicaset.apps/tencentcloud-cloud-controller-manager-5674868749   1         1         1         51m

The result is shown below; compared with the earlier screenshot, monitoring data such as CPU and memory is now visible:

Accessing grafana:

[root@sh-saas-cvmk8s-master-01 deploy]# kubectl cluster-info
Kubernetes master is running at https://10.12.16.101:6443
Heapster is running at https://10.12.16.101:6443/api/v1/namespaces/kube-system/services/heapster/proxy
KubeDNS is running at https://10.12.16.101:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
monitoring-grafana is running at https://10.12.16.101:6443/api/v1/namespaces/kube-system/services/monitoring-grafana/proxy
monitoring-influxdb is running at https://10.12.16.101:6443/api/v1/namespaces/kube-system/services/monitoring-influxdb/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

Open https://10.12.16.101:6443/api/v1/namespaces/kube-system/services/monitoring-grafana/proxy in a browser and you will see the following:

It contains monitoring data for the K8s cluster and for each pod.

7. Connecting the container VPC network with the corporate network

The corporate network is 172.21.0.0/12, connected to the Tencent Cloud VPC network over a dedicated line running BGP. Testing against the cluster built above revealed the following behavior:

Talking with the company's network engineers, we found that for corporate-to-container traffic, the corporate-side router had not learned a BGP route to the container network 10.254.0.0/16, which is why that direction failed.
The container-to-corporate direction worked because our kubernetes cluster uses kubenet bridge mode, and kubelet is started with --non-masquerade-cidr=10.0.0.0/8, which specifies the destination range whose outbound traffic is NOT NATed. Traffic to anything outside 10.0.0.0/8 is therefore SNATed to the node IP on the way out, so that direction succeeded.

With the problem understood, the fix is clear: first stop NATing traffic to the corporate network, then fix the BGP route table so it learns the container network's route.

7.1 Using ip-masq-agent to exempt multiple ranges from NAT

We found that --non-masquerade-cidr=10.0.0.0/8 cannot take multiple ranges, so a third-party tool is needed: ip-masq-agent. It runs as a DaemonSet on every kubernetes node and maintains the corresponding iptables rules.
The ip-masq-agent project lives at: https://github.com/kubernetes-incubator/ip-masq-agent
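The agent's decision can be sketched with Python's standard ipaddress module. The ranges below match the agent-config.yaml used later in this section; the function is an illustration of the rule semantics, not the agent's actual code. With only the kubelet default of 10.0.0.0/8 exempt, a corporate address such as 172.21.251.111 would be masqueraded; with all three RFC 1918 ranges exempt it is not:

```python
import ipaddress

# nonMasqueradeCIDRs from the agent config (the three RFC 1918 ranges)
NON_MASQ = [ipaddress.ip_network(c) for c in
            ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]
LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")

def should_masquerade(dst: str, masq_link_local: bool = False) -> bool:
    """Return True if traffic to dst would be SNATed to the node IP."""
    ip = ipaddress.ip_address(dst)
    if not masq_link_local and ip in LINK_LOCAL:
        return False  # RETURN: link-local left alone unless masqLinkLocal: true
    if any(ip in net for net in NON_MASQ):
        return False  # RETURN: destination matches a nonMasqueradeCIDR
    return True       # MASQUERADE: everything else goes out via the node IP

print(should_masquerade("172.21.251.111"))  # False: corporate range is now exempt
print(should_masquerade("59.81.64.174"))    # True: internet traffic is still NATed
```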

docker pull anjia0532/google-containers.ip-masq-agent-amd64:v2.0.2
docker tag anjia0532/google-containers.ip-masq-agent-amd64:v2.0.2 ccr.ccs.tencentyun.com/huilog-public/ip-masq-agent-amd64:v2.0.2
docker push ccr.ccs.tencentyun.com/huilog-public/ip-masq-agent-amd64:v2.0.2
printf '[Service]\nEnvironment="KUBELET_NETWORK_ARGS=--network-plugin=kubenet --non-masquerade-cidr=0.0.0.0/0"\n' > /etc/systemd/system/kubelet.service.d/30-kubeadm.conf

[root@sh-saas-cvmk8s-node-01 ~]# systemctl daemon-reload

[root@sh-saas-cvmk8s-node-01 ~]# ps auxw |grep kubelet
root     19743  0.0  0.0 112704   972 pts/0    S+   14:26   0:00 grep --color=auto kubelet
root     28391  1.3  0.1 2388928 100344 ?      Ssl  Jul04 263:44 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true --network-plugin=kubenet --cluster-dns=10.254.255.10 --cluster-domain=cluster.local --authorization-mode=Webhook --client-ca-file=/etc/kubernetes/pki/ca.crt --cadvisor-port=0 --cgroup-driver=cgroupfs --rotate-certificates=true --cert-dir=/var/lib/kubelet/pki --v=2 --fail-swap-on=false --pod-infra-container-image=172.21.248.242/k8s/pause-amd64:3.1 --cloud-provider=external --hostname-override=10.12.96.2

[root@sh-saas-cvmk8s-node-01 ~]# systemctl stop kubelet
[root@sh-saas-cvmk8s-node-01 ~]# iptables -F && iptables -t nat -F && iptables -Z
[root@sh-saas-cvmk8s-node-01 ~]# systemctl start kubelet
[root@sh-saas-cvmk8s-node-01 ~]# /sbin/iptables -I FORWARD -s 0.0.0.0/0 -j ACCEPT
[root@sh-saas-cvmk8s-node-01 ~]# ps auxw |grep kubelet
root     19832 14.2  0.0 1590676 70980 ?       Ssl  14:27   0:00 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true --network-plugin=kubenet --non-masquerade-cidr=0.0.0.0/0 --cluster-dns=10.254.255.10 --cluster-domain=cluster.local --authorization-mode=Webhook --client-ca-file=/etc/kubernetes/pki/ca.crt --cadvisor-port=0 --cgroup-driver=cgroupfs --rotate-certificates=true --cert-dir=/var/lib/kubelet/pki --v=2 --fail-swap-on=false --pod-infra-container-image=172.21.248.242/k8s/pause-amd64:3.1 --cloud-provider=external --hostname-override=10.12.96.2
root     19975  0.0  0.0 112704   972 pts/0    S+   14:27   0:00 grep --color=auto kubelet

After restarting kubelet, you can see the --non-masquerade-cidr=0.0.0.0/0 flag has taken effect.
Restarting kubelet does not disrupt running pods.
Flushing iptables (iptables -F && iptables -t nat -F && iptables -Z) does interrupt the network, but without this step kubelet will not regenerate its iptables rules.
/sbin/iptables -I FORWARD -s 0.0.0.0/0 -j ACCEPT must also be run, otherwise connectivity to the corporate network will fail later.

Pinging now shows the corporate network is no longer reachable, confirming the earlier deduction: the traffic was indeed leaving via NAT.

[root@sh-saas-cvmk8s-master-01 ip-masq-agent]# kubectl exec -it elasticsearch-logging-1  -n=kube-system -- sh                                                       
sh-4.2# ping 172.21.251.111
PING 172.21.251.111 (172.21.251.111) 56(84) bytes of data.
^C
--- 172.21.251.111 ping statistics ---
7 packets transmitted, 0 received, 100% packet loss, time 6999ms

wget https://raw.githubusercontent.com/kubernetes-incubator/ip-masq-agent/master/ip-masq-agent.yaml

# Edit the yaml file, change the image to ccr.ccs.tencentyun.com/huilog-public/ip-masq-agent-amd64:v2.0.2, then deploy

# Write the config file
# Reference: https://raw.githubusercontent.com/kubernetes-incubator/ip-masq-agent/v2.0.1/agent-config/config
# nonMasqueradeCIDRs []string: a list of strings in CIDR notation specifying the non-masquerade ranges.
# masqLinkLocal bool: whether to masquerade traffic to 169.254.0.0/16. Defaults to false.
# resyncInterval string: the interval at which the agent re-reads its config from disk. The syntax is any format accepted by Go's time.ParseDuration.

vim agent-config.yaml
nonMasqueradeCIDRs:
  - 10.0.0.0/8
  - 172.16.0.0/12
  - 192.168.0.0/16
masqLinkLocal: false
resyncInterval: 60s

# Deploy ip-masq-agent
[root@sh-saas-cvmk8s-master-01 ip-masq-agent]# kubectl create -f ip-masq-agent.yaml 
daemonset.extensions "ip-masq-agent" created

[root@sh-saas-cvmk8s-master-01 ip-masq-agent]# kubectl create configmap ip-masq-agent --from-file=agent-config.yaml --namespace=kube-system
configmap "ip-masq-agent" created
# Before deploying ip-masq-agent
[root@sh-saas-cvmk8s-node-01 ~]# iptables -t nat -vxnL POSTROUTING
Chain POSTROUTING (policy ACCEPT 19 packets, 1195 bytes)
    pkts      bytes target     prot opt in     out     source               destination         
     6746 KUBE-POSTROUTING  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kubernetes postrouting rules */
        0 MASQUERADE  all  --  *      cbr0    127.0.0.0/8          0.0.0.0/0            /* SNAT for localhost access to hostports */

# After deploying ip-masq-agent
[root@sh-saas-cvmk8s-node-01 ~]# iptables -t nat -vxnL IP-MASQ-AGENT
Chain IP-MASQ-AGENT (1 references)
    pkts      bytes target     prot opt in     out     source               destination         
      936 RETURN     all  --  *      *       0.0.0.0/0            169.254.0.0/16       /* ip-masq-agent: local traffic is not subject to MASQUERADE */
      468 RETURN     all  --  *      *       0.0.0.0/0            10.0.0.0/8           /* ip-masq-agent: local traffic is not subject to MASQUERADE */
        0 RETURN     all  --  *      *       0.0.0.0/0            172.16.0.0/12        /* ip-masq-agent: local traffic is not subject to MASQUERADE */
        0 RETURN     all  --  *      *       0.0.0.0/0            192.168.0.0/16       /* ip-masq-agent: local traffic is not subject to MASQUERADE */
     1350 MASQUERADE  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* ip-masq-agent: outbound traffic is subject to MASQUERADE (must be last in chain) */ 
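The chain above is evaluated top-down with first-match-wins semantics, which is why the catch-all MASQUERADE rule must stay last: placed any earlier, it would SNAT every destination and the RETURN exemptions would never be reached. A small first-match simulation (illustrative only; the rule list mirrors the iptables output above):

```python
import ipaddress

# (destination CIDR, target), in the order shown by iptables above
RULES = [
    ("169.254.0.0/16", "RETURN"),
    ("10.0.0.0/8",     "RETURN"),
    ("172.16.0.0/12",  "RETURN"),
    ("192.168.0.0/16", "RETURN"),
    ("0.0.0.0/0",      "MASQUERADE"),  # must be last in the chain
]

def verdict(dst: str) -> str:
    """Walk the chain the way netfilter does: the first matching rule wins."""
    ip = ipaddress.ip_address(dst)
    for cidr, target in RULES:
        if ip in ipaddress.ip_network(cidr):
            return target
    return "ACCEPT"  # chain policy, if nothing matched

print(verdict("10.254.4.6"))  # RETURN: the pod CIDR falls inside 10.0.0.0/8
print(verdict("8.8.8.8"))     # MASQUERADE: internet-bound traffic is SNATed
```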

7.2 Opening connectivity between the container network and the corporate network

This requires Tencent Cloud to advertise the container network's routes over BGP, but Tencent Cloud has not yet productized the feature. Provide the following information:

Region, account appId, vpcId unique string id (prefixed vpc), container subnet (with netmask), direct connect gateway unique string id (prefixed dcg)

and ask Tencent Cloud to enable it on the backend. Once added, tests showed no problems.

# From the corporate network to the container network
[root@bs-ops-test-docker-dev-01 ~]# tracepath 10.254.4.6
 1?: [LOCALHOST]                                        pmtu 1500
 1:  gateway                                               3.262ms 
 1:  gateway                                               2.137ms 
 2:  169.254.1.65                                          1.508ms 
 3:  10.200.9.197                                          1.906ms asymm  4 
 4:  10.12.96.4                                            2.417ms 
 5:  10.254.4.6                                            2.575ms reached
     Resume: pmtu 1500 hops 5 back 4 
[root@bs-ops-test-docker-dev-01 ~]# ping 10.254.4.6
PING 10.254.4.6 (10.254.4.6) 56(84) bytes of data.
64 bytes from 10.254.4.6: icmp_seq=1 ttl=252 time=2.31 ms
64 bytes from 10.254.4.6: icmp_seq=2 ttl=252 time=2.20 ms
^C
--- 10.254.4.6 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 2.203/2.260/2.318/0.074 ms
[root@bs-ops-test-docker-dev-01 ~]# 

# From the container network to the corporate network
[root@sh-saas-cvmk8s-master-01 ip-masq-agent]# kubectl exec -it elasticsearch-logging-1  -n=kube-system -- sh
sh-4.2# ping 172.21.251.111
PING 172.21.251.111 (172.21.251.111) 56(84) bytes of data.
64 bytes from 172.21.251.111: icmp_seq=1 ttl=60 time=2.74 ms
64 bytes from 172.21.251.111: icmp_seq=2 ttl=60 time=2.62 ms
64 bytes from 172.21.251.111: icmp_seq=3 ttl=60 time=2.64 ms
^C
--- 172.21.251.111 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 2.624/2.670/2.746/0.054 ms
sh-4.2# exit

8. References

8.1 The master node-name issue

https://github.com/kubernetes/kubeadm/issues/584

8.2 The kubeadm config file

https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm#MasterConfiguration
https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-init/
https://kubernetes.io/docs/setup/independent/control-plane-flags/

8.3 Taints and Tolerations

https://k8smeetup.github.io/docs/concepts/configuration/taint-and-toleration/
https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/

Allow pods to be scheduled on a master:
[root@sh-saas-cvmk8s-master-02 k8s]# kubectl taint nodes 10.12.96.5 node-role.kubernetes.io/master-
node "10.12.96.5" untainted
Prevent pods from being scheduled on a master:
[root@sh-saas-cvmk8s-master-02 k8s]# kubectl taint nodes 10.12.96.5 node-role.kubernetes.io/master="":NoSchedule
node "10.12.96.5" tainted

8.4 tencentcloud-cloud-controller-manager

https://github.com/dbdd4us/tencentcloud-cloud-controller-manager
https://github.com/dbdd4us/tencentcloud-cloud-controller-manager/blob/master/README_zhCN.md
The google cloud controller docs below are also worth referencing:
https://kubernetes.io/docs/tasks/administer-cluster/running-cloud-controller/

8.5 Building a highly available active/standby cluster in a VPC with keepalived

https://cloud.tencent.com/document/product/215/5850

8.6 ip-masq-agent

http://madorn.com/kubernetes-non-masquerade-cidr.html#.W073NrgamUk
https://github.com/kubernetes-incubator/ip-masq-agent
https://rocdu.io/2018/03/%E7%AE%A1%E7%90%86pod%E7%9A%84nat%E7%AD%96%E7%95%A5/
https://kubernetes.io/docs/tasks/administer-cluster/ip-masq-agent/
