A quick record of a Kubernetes installation using TLS Bootstrapping token authentication. Node Authorizer authentication will be covered in a later post.


Environment

Resources are limited, so there are only two machines: one master and one node.

Node list

Hostname      IP              Role     OS version
k8s-77-167    172.16.77.167   master   CentOS 7.7
k8s-77-153    172.16.77.153   node     CentOS 7.7

Kubernetes components in brief

master

  • API Server: the component on the master node that exposes the Kubernetes API; it is the front end of the Kubernetes control plane.
  • Controller Manager: runs all of the controllers that handle routine cluster tasks, including the node controller, replication controller, endpoints controller, and the service account and token controllers. Each controller works independently to maintain its desired state.
  • Scheduler: the master component that watches for newly created Pods that have no node assigned and selects a node for them to run on.

node

  • Kubelet: an agent that runs on every node in the cluster. It makes sure that containers are running in a Pod.

    The kubelet takes a set of PodSpecs provided through various mechanisms and ensures that the containers described in those PodSpecs are running and healthy. The kubelet does not manage containers that were not created by Kubernetes.

  • Kube-proxy: a network proxy that runs on every node in the cluster, implementing part of the Kubernetes Service concept.

    kube-proxy maintains network rules on the node. These rules allow network sessions from inside or outside the cluster to communicate with Pods.

    If the operating system provides a packet filtering layer and it is available, kube-proxy uses it to implement the network rules. Otherwise, kube-proxy forwards the traffic itself.
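
Once the cluster below is up, a quick way to see all of these components on each machine (my own addition, not part of the original steps) is to check their systemd units:

# on the master
$ systemctl status kube-apiserver kube-controller-manager kube-scheduler
# on the node
$ systemctl status kubelet kube-proxy docker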

Package versions

Package                                  URL
kubernetes-server-linux-amd64.tar.gz https://dl.k8s.io/v1.17.5/kubernetes-server-linux-amd64.tar.gz
kubernetes-node-linux-amd64.tar.gz https://dl.k8s.io/v1.17.5/kubernetes-node-linux-amd64.tar.gz
flannel-v0.12.0-linux-amd64.tar.gz https://github.com/coreos/flannel/releases/download/v0.12.0/flannel-v0.12.0-linux-amd64.tar.gz
etcd-v3.4.7-linux-amd64.tar.gz https://github.com/etcd-io/etcd/releases/download/v3.4.7/etcd-v3.4.7-linux-amd64.tar.gz

System initialization

The following steps must be performed on every node.

Disable the firewall and SELinux

$ systemctl stop firewalld && systemctl disable firewalld
$ setenforce 0
$ vim /etc/selinux/config
   SELINUX=disabled

Disable swap

$ swapoff -a 
$ sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab

Set kernel parameters

$ cat > /etc/sysctl.d/kubernetes.conf <<EOF
net.ipv4.ip_forward=1
net.ipv4.tcp_tw_recycle=0
vm.swappiness=0
vm.overcommit_memory=1
vm.panic_on_oom=0
fs.inotify.max_user_watches=89100
fs.file-max=52706963
fs.nr_open=52706963
net.ipv6.conf.all.disable_ipv6=1
EOF

$ sysctl -p  /etc/sysctl.d/kubernetes.conf 

Create the required directories

# Temporary storage for related files and installation packages
$ mkdir -p /opt/k8s/{install,ssl_config}

# Create the config/cert directories
$ mkdir -p /etc/etcd/{config,ssl}
$ mkdir -p /etc/kubernetes/{config,ssl}

Set up passwordless SSH

$ ssh-keygen -t rsa
$ ssh-copy-id 172.16.77.153

Install Docker

# Remove old versions
$ yum remove docker \
                  docker-client \
                  docker-client-latest \
                  docker-common \
                  docker-latest \
                  docker-latest-logrotate \
                  docker-logrotate \
                  docker-engine
# Set up the repository
$ yum install -y yum-utils
$ yum-config-manager \
    --add-repo \
    https://download.docker.com/linux/centos/docker-ce.repo
# Install a specific version
# List the versions available
$ yum list docker-ce --showduplicates | sort -r

# Install version 18.09.6
$ yum install docker-ce-18.09.6 docker-ce-cli-18.09.6 containerd.io -y
# Start Docker
$ systemctl start docker && systemctl enable docker

Master deployment

Install cfssl

$ wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
$ wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
$ wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
$ chmod +x cfssl_linux-amd64 cfssljson_linux-amd64 cfssl-certinfo_linux-amd64
$ mv cfssl_linux-amd64 /usr/local/bin/cfssl
$ mv cfssljson_linux-amd64 /usr/local/bin/cfssljson
$ mv cfssl-certinfo_linux-amd64 /usr/bin/cfssl-certinfo

Download the packages

$ cd /opt/k8s/install
$ wget --timestamping \
 https://github.com/etcd-io/etcd/releases/download/v3.4.7/etcd-v3.4.7-linux-amd64.tar.gz \
 https://github.com/coreos/flannel/releases/download/v0.12.0/flannel-v0.12.0-linux-amd64.tar.gz \
 https://dl.k8s.io/v1.17.5/kubernetes-server-linux-amd64.tar.gz \
 https://dl.k8s.io/v1.17.5/kubernetes-node-linux-amd64.tar.gz

Create the etcd certificates

Here the Kubernetes components and etcd share a single CA certificate; a separate CA could also be created for each.

CA certificate configuration

CA config
$ cd /opt/k8s/ssl_config
$ cat << EOF | tee ca-config.json
{
  "signing": {
    "default": {
      "expiry": "876000h"
    },
    "profiles": {
      "kubernetes": {
         "expiry": "876000h",
         "usages": [
            "signing",
            "key encipherment",
            "server auth",
            "client auth"
        ]
      }
    }
  }
}
EOF
  • ca-config.json: multiple profiles can be defined, each specifying a different expiry, usage scenario, and other parameters; a particular profile is referenced later when signing certificates.
  • signing: indicates that the certificate can be used to sign other certificates; the generated ca.pem will have CA=TRUE.
  • server auth: the client can use this CA to verify certificates presented by the server.
  • client auth: the server can use this CA to verify certificates presented by the client.
CA certificate signing request
$ cat << EOF | tee ca-csr.json
{
    "CN": "kubernetes",
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "L": "QingDao",
            "ST": "QingDao",
            "O": "k8s",
            "OU": "System"
        }
    ],
     "ca": {
       "expiry": "876000h"
    }
}
EOF
  • CN: Common Name. kube-apiserver extracts this field from the certificate as the requesting user name (User Name); browsers use this field to check whether a website is legitimate.
  • O: Organization. kube-apiserver extracts this field from the certificate as the group (Group) the requesting user belongs to.
Generate the CA certificate and private key
$ cfssl gencert -initca ca-csr.json | cfssljson -bare ca
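A quick sanity check (my addition, not part of the original steps): inspect the freshly generated CA certificate and confirm that it is marked as a CA and carries the long validity configured above.

$ openssl x509 -in ca.pem -noout -text | grep -A1 "Basic Constraints"
$ openssl x509 -in ca.pem -noout -enddate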

etcd certificate configuration

Create the etcd certificate signing request
$ cat > etcd-csr.json <<EOF
{
  "CN": "etcd",
  "hosts": [
    "127.0.0.1",
    "172.16.77.167"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "QingDao",
      "L": "QingDao",
      "O": "k8s",
      "OU": "System"
    }
  ]
}
EOF
Generate the etcd certificate
$ cfssl gencert -ca=ca.pem \
  -ca-key=ca-key.pem \
  -config=ca-config.json \
  -profile=kubernetes etcd-csr.json | cfssljson -bare etcd
# Copy the certificates into place
$ cp etcd*.pem /etc/etcd/ssl/
$ cp ca* /etc/kubernetes/ssl/

Deploy etcd

Unpack the binaries

$ cd /opt/k8s/install
$ tar -zxvf etcd-v3.4.7-linux-amd64.tar.gz
$ cd etcd-v3.4.7-linux-amd64/
$ cp etcd etcdctl /usr/local/bin/

Create the etcd systemd unit file

$ mkdir -p /var/lib/etcd  # the working directory must be created before starting etcd
$  cat > etcd.service <<EOF
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos

[Service]
Type=notify
WorkingDirectory=/var/lib/etcd/
ExecStart=/usr/local/bin/etcd \\
  --name=etcd1 \\
  --enable-v2=true \\
  --cert-file=/etc/etcd/ssl/etcd.pem \\
  --key-file=/etc/etcd/ssl/etcd-key.pem \\
  --peer-cert-file=/etc/etcd/ssl/etcd.pem \\
  --peer-key-file=/etc/etcd/ssl/etcd-key.pem \\
  --trusted-ca-file=/etc/kubernetes/ssl/ca.pem \\
  --peer-trusted-ca-file=/etc/kubernetes/ssl/ca.pem \\
  --initial-advertise-peer-urls=https://172.16.77.167:2380 \\
  --listen-peer-urls=https://172.16.77.167:2380 \\
  --listen-client-urls=https://172.16.77.167:2379,http://127.0.0.1:2379 \\
  --advertise-client-urls=https://172.16.77.167:2379 \\
  --initial-cluster-token=etcd-cluster \\
  --initial-cluster=etcd1=https://172.16.77.167:2380 \\
  --initial-cluster-state=new \\
  --data-dir=/var/lib/etcd
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF
  • etcd's working directory and data directory are set to /var/lib/etcd; this directory must be created before starting the service.
  • When --initial-cluster-state is new, the value of --name must appear in the --initial-cluster list.

Start the etcd service

$  mv etcd.service /etc/systemd/system/
$  systemctl daemon-reload
$  systemctl enable etcd
$  systemctl start etcd
$  systemctl status etcd

Verify the service

# Note: the etcd version here is the latest 3.4.7. Newer etcd releases default to the v3 API, while 3.3.x defaults to v2. Use etcdctl --help to confirm the API version.
$ etcdctl --help  # confirm the API version
# For API v2, run:
$ ETCDCTL_API=2 etcdctl --ca-file=/etc/kubernetes/ssl/ca.pem \
  --cert-file=/etc/etcd/ssl/etcd.pem \
  --key-file=/etc/etcd/ssl/etcd-key.pem \
  --endpoints="https://172.16.77.167:2379"  cluster-health
# For API v3, run:
$ etcdctl  --cacert=/etc/kubernetes/ssl/ca.pem \
  --cert=/etc/etcd/ssl/etcd.pem \
  --key=/etc/etcd/ssl/etcd-key.pem \
  --endpoints="https://172.16.77.167:2379"  endpoint health
 https://172.16.77.167:2379 is healthy: successfully committed proposal: took = 7.825088ms
$ etcdctl  --cacert=/etc/kubernetes/ssl/ca.pem \
  --cert=/etc/etcd/ssl/etcd.pem \
  --key=/etc/etcd/ssl/etcd-key.pem \
  --endpoints="https://172.16.77.167:2379"  endpoint status
 https://172.16.77.167:2379, 71f7f677733ef3fa, 3.4.7, 16 kB, true, false, 4, 8, 8,

Deploy flannel

Write the cluster Pod network configuration into etcd

# For API v2, run:
$ ETCDCTL_API=2 etcdctl --ca-file=/etc/kubernetes/ssl/ca.pem \
  --cert-file=/etc/etcd/ssl/etcd.pem \
  --key-file=/etc/etcd/ssl/etcd-key.pem \
  --endpoints="https://172.16.77.167:2379" \
  set /coreos.com/network/config  \
 '{ "Network": "172.18.0.0/16", "Backend": {"Type": "vxlan"}}'
 
# For API v3, run:
$ etcdctl  --cacert=/etc/kubernetes/ssl/ca.pem \
  --cert=/etc/etcd/ssl/etcd.pem \
  --key=/etc/etcd/ssl/etcd-key.pem \
  --endpoints="https://172.16.77.167:2379" \
 put /coreos.com/network/config  \
 '{ "Network": "172.18.0.0/16", "Backend": {"Type": "vxlan"}}'
 
# Check the value
$ etcdctl  --cacert=/etc/kubernetes/ssl/ca.pem \
  --cert=/etc/etcd/ssl/etcd.pem \
  --key=/etc/etcd/ssl/etcd-key.pem \
  --endpoints="https://172.16.77.167:2379" \
 get  /coreos.com/network/config 

Note: the Pod network written here must match the value of the kube-controller-manager --cluster-cidr option.

Unpack the binaries

$ cd /opt/k8s/install/
$ tar -zxvf flannel-v0.12.0-linux-amd64.tar.gz
$ cp flanneld mk-docker-opts.sh /usr/local/bin/

Create the flannel systemd unit file

cat > flanneld.service << EOF
[Unit]
Description=Flanneld overlay address etcd agent
After=network.target
After=network-online.target
Wants=network-online.target
After=etcd.service
Before=docker.service


[Service]
Type=notify
ExecStart=/usr/local/bin/flanneld \
  -etcd-cafile=/etc/kubernetes/ssl/ca.pem \
  -etcd-certfile=/etc/etcd/ssl/etcd.pem \
  -etcd-keyfile=/etc/etcd/ssl/etcd-key.pem \
  -etcd-endpoints=https://172.16.77.167:2379 \
  -etcd-prefix=/coreos.com/network
ExecStartPost=/usr/local/bin/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/docker
Restart=on-failure

[Install]
WantedBy=multi-user.target
RequiredBy=docker.service
EOF
  • The mk-docker-opts.sh script writes the Pod subnet assigned to flanneld into /run/flannel/docker; when docker starts later, it uses the parameters in this file to configure the docker0 bridge.
  • flanneld talks to the other nodes over the interface of the system default route. On machines with multiple interfaces (internal and public), the --iface option can select the interface to use (the systemd unit above does not set it).

Start flannel

Note: there is a big pitfall here.

$ cp flanneld.service /etc/systemd/system/
$ systemctl daemon-reload
$ systemctl enable flanneld
$ systemctl start flanneld
$ systemctl status flanneld

At this point flannel failed to start, with the following errors:

Apr 18 15:20:45 k8s-77-167 flanneld[3932]: E0418 15:20:45.227344    3932 main.go:386] Couldn't fetch network config: client: response is invalid json. The endpoint is probably not valid etcd cluster endpoint.
Apr 18 15:20:46 k8s-77-167 flanneld[3932]: timed out

Checking the documentation shows that even the current latest flannel, v0.12, still does not support the etcd v3 API, while etcd 3.4.x defaults to v3.

Workarounds
  • Option 1

    Reinstall etcd with version 3.3.10, which defaults to the v2 API (any version that defaults to v2 will do).

  • Option 2

    1. Start etcd with the v2 API enabled by adding --enable-v2=true to the startup arguments, then restart etcd.

    2. Re-write the cluster Pod network configuration into etcd with the v2 API, then start flannel again.

$ ETCDCTL_API=2  etcdctl --ca-file=/etc/kubernetes/ssl/ca.pem \
  --cert-file=/etc/etcd/ssl/etcd.pem \
  --key-file=/etc/etcd/ssl/etcd-key.pem \
  --endpoints="https://172.16.77.167:2379" \
  set /coreos.com/network/config  \
 '{ "Network": "172.18.0.0/16", "Backend": {"Type": "vxlan"}}'

Verify the flannel service

# Check the flannel interface
$ ifconfig flannel.1

# Check the Pod subnets assigned to each flanneld
# View the Pod network (/16)
$ ETCDCTL_API=2 etcdctl --ca-file=/etc/kubernetes/ssl/ca.pem \
  --cert-file=/etc/etcd/ssl/etcd.pem \
  --key-file=/etc/etcd/ssl/etcd-key.pem \
  --endpoints="https://172.16.77.167:2379" \
  get /coreos.com/network/config  
  
 { "Network": "172.18.0.0/16", "Backend": {"Type": "vxlan"}} # this line is the output
 
 # List the allocated Pod subnets (/24)
$ ETCDCTL_API=2 etcdctl --ca-file=/etc/kubernetes/ssl/ca.pem \
  --cert-file=/etc/etcd/ssl/etcd.pem \
  --key-file=/etc/etcd/ssl/etcd-key.pem \
  --endpoints="https://172.16.77.167:2379" \
  ls /coreos.com/network/subnets 
  
 /coreos.com/network/subnets/172.18.41.0-24
 # View the IP and network parameters of the flanneld process serving a given Pod subnet
$ ETCDCTL_API=2 etcdctl --ca-file=/etc/kubernetes/ssl/ca.pem \
  --cert-file=/etc/etcd/ssl/etcd.pem \
  --key-file=/etc/etcd/ssl/etcd-key.pem \
  --endpoints="https://172.16.77.167:2379" \
  get /coreos.com/network/subnets/172.18.41.0-24
  
 {"PublicIP":"172.16.77.167","BackendType":"vxlan","BackendData":{"VtepMAC":"5e:cc:d2:9d:09:d9"}}

Configure docker to use the assigned subnet

$ vim /usr/lib/systemd/system/docker.service
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
BindsTo=containerd.service
After=network-online.target firewalld.service containerd.service
Wants=network-online.target
Requires=docker.socket

[Service]
Type=notify
EnvironmentFile=/run/flannel/docker # add this line
ExecStart=/usr/bin/dockerd $DOCKER_NETWORK_OPTIONS # add $DOCKER_NETWORK_OPTIONS to the start command
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=0
RestartSec=2
Restart=always
StartLimitBurst=3

StartLimitInterval=60s

LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity

TasksMax=infinity

Delegate=yes

KillMode=process

[Install]
WantedBy=multi-user.target
  • When flanneld starts, it writes the network configuration into the DOCKER_NETWORK_OPTIONS variable in /run/flannel/docker; passing this variable on the dockerd command line sets the docker0 bridge parameters.
  • If multiple EnvironmentFile options are specified, /run/flannel/docker must come last, to ensure docker0 uses the bip parameter generated by flanneld.

Restart docker

$  systemctl daemon-reload
$  systemctl restart docker
$  systemctl status docker

Copy the flannel-related files to the node

$ scp -r /etc/kubernetes/ssl 172.16.77.153:/etc/kubernetes
$ scp -r /etc/etcd/ssl 172.16.77.153:/etc/etcd
$ scp /usr/lib/systemd/system/docker.service 172.16.77.153:/usr/lib/systemd/system/docker.service
$ scp /etc/systemd/system/flanneld.service 172.16.77.153:/etc/systemd/system/flanneld.service
$ scp /usr/local/bin/{flanneld,mk-docker-opts.sh} 172.16.77.153:/usr/local/bin/

# Run on the node:
$ systemctl daemon-reload
$ systemctl start flanneld
$ systemctl enable flanneld
$ systemctl restart docker

Verify the flannel network configuration

$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:50:56:90:be:3b brd ff:ff:ff:ff:ff:ff
    inet 172.16.77.167/24 brd 172.16.77.255 scope global noprefixroute eth0
       valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:ac:c3:5b:f4 brd ff:ff:ff:ff:ff:ff
    inet 172.18.41.1/24 brd 172.18.41.255 scope global docker0
       valid_lft forever preferred_lft forever
4: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
    link/ether 5e:cc:d2:9d:09:d9 brd ff:ff:ff:ff:ff:ff
    inet 172.18.41.0/32 scope global flannel.1
       valid_lft forever preferred_lft forever
$ cat /run/flannel/docker
DOCKER_OPT_BIP="--bip=172.18.41.1/24"
DOCKER_OPT_IPMASQ="--ip-masq=true"
DOCKER_OPT_MTU="--mtu=1450"
DOCKER_NETWORK_OPTIONS=" --bip=172.18.41.1/24 --ip-masq=true --mtu=1450"

$ cat /run/flannel/subnet.env
FLANNEL_NETWORK=172.18.0.0/16
FLANNEL_SUBNET=172.18.41.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=false

Deploy the master components

Create the Kubernetes certificates

  • Create the kube-apiserver certificate signing request
$ cd /opt/k8s/ssl_config
$ cat > kubernetes-csr.json <<EOF
{
  "CN": "kubernetes",
  "hosts": [
    "127.0.0.1",
    "172.16.77.167",
    "172.16.77.153",
    "10.254.0.1",
    "k8s.test.io",
    "kubernetes",
    "kubernetes.default",
    "kubernetes.default.svc",
    "kubernetes.default.svc.cluster",
    "kubernetes.default.svc.cluster.local"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "QingDao",
      "L": "QingDao",
      "O": "k8s",
      "OU": "System"
    }
  ]
}
EOF
  • If the hosts field is not empty, it must list the IPs or domain names that are authorized to use this certificate, so the master node IP being deployed and the internal domain names used for the apiserver are listed above.
  • The Service cluster IP that kube-apiserver registers for the kubernetes service must also be added; it is normally the first IP of the range given by kube-apiserver --service-cluster-ip-range, e.g. 10.254.0.1.

Generate the Kubernetes certificate and private key

$  cfssl gencert -ca=ca.pem \
  -ca-key=ca-key.pem \
  -config=ca-config.json \
  -profile=kubernetes kubernetes-csr.json | cfssljson -bare kubernetes
$ cp kubernetes*.pem /etc/kubernetes/ssl/
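Optionally (my addition), confirm that every host listed above really ended up in the certificate's Subject Alternative Names:

$ openssl x509 -in /etc/kubernetes/ssl/kubernetes.pem -noout -text | grep -A1 "Subject Alternative Name"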

Create the TLS Bootstrapping token

$ head -c 16 /dev/urandom | od -An -t x | tr -d ' '
794f6f841f297b6c6ec775636f73544f

$ vim /etc/kubernetes/config/token.csv
794f6f841f297b6c6ec775636f73544f,kubelet-bootstrap,10001,"system:kubelet-bootstrap"
# The fields are: a random 32-character token, user name, UID, group

Unpack the binaries

$ cd /opt/k8s/install/
$ tar -zxvf kubernetes-server-linux-amd64.tar.gz
$ cd kubernetes/server/bin
$ cp kube-apiserver kube-controller-manager kubectl kubelet kube-proxy kube-scheduler /usr/local/bin/
$ scp kubectl kubelet kube-proxy 172.16.77.153:/usr/local/bin/ 

Create the kube-apiserver systemd unit file

$ cat  > kube-apiserver.service <<EOF
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target

[Service]
ExecStart=/usr/local/bin/kube-apiserver \\
  --admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota,NodeRestriction,MutatingAdmissionWebhook,ValidatingAdmissionWebhook \\
  --advertise-address=172.16.77.167 \\
  --bind-address=172.16.77.167 \\
  --secure-port=6443 \\
  --insecure-bind-address=127.0.0.1 \\
  --authorization-mode=Node,RBAC \\
  --kubelet-https=true \\
  --enable-bootstrap-token-auth \\
  --token-auth-file=/etc/kubernetes/config/token.csv \\
  --service-cluster-ip-range=10.254.0.0/16 \\
  --service-node-port-range=20000-40000 \\
  --tls-cert-file=/etc/kubernetes/ssl/kubernetes.pem \\
  --tls-private-key-file=/etc/kubernetes/ssl/kubernetes-key.pem \\
  --client-ca-file=/etc/kubernetes/ssl/ca.pem \\
  --service-account-key-file=/etc/kubernetes/ssl/ca-key.pem \\
  --etcd-cafile=/etc/kubernetes/ssl/ca.pem \\
  --etcd-certfile=/etc/kubernetes/ssl/kubernetes.pem \\
  --etcd-keyfile=/etc/kubernetes/ssl/kubernetes-key.pem \\
  --etcd-servers=https://172.16.77.167:2379 \\
  --enable-swagger-ui=true \\
  --allow-privileged=true \\
  --audit-log-maxage=30 \\
  --audit-log-maxbackup=3 \\
  --audit-log-maxsize=100 \\
  --audit-log-path=/var/lib/audit.log \\
  --event-ttl=1h \\
  --logtostderr=true \\
  --v=6
Restart=on-failure
RestartSec=5
Type=notify
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF
  • --service-cluster-ip-range specifies the Service cluster IP range; this range must not be routable.

Start the service

$  cp kube-apiserver.service /etc/systemd/system/
$  systemctl daemon-reload
$  systemctl enable kube-apiserver
$  systemctl start kube-apiserver
$  systemctl status kube-apiserver
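A quick check (my addition; it assumes the default insecure port 8080, since the unit above sets --insecure-bind-address=127.0.0.1): a plain HTTP health probe from the master should answer ok.

$ curl http://127.0.0.1:8080/healthz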

Create the kube-controller-manager systemd unit file

$  cat > kube-controller-manager.service <<EOF
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/GoogleCloudPlatform/kubernetes

[Service]
ExecStart=/usr/local/bin/kube-controller-manager \\
  --address=127.0.0.1 \\
  --master=http://127.0.0.1:8080 \\
  --allocate-node-cidrs=true \\
  --service-cluster-ip-range=10.254.0.0/16 \\
  --cluster-cidr=172.18.0.0/16 \\
  --cluster-name=kubernetes \\
  --cluster-signing-cert-file=/etc/kubernetes/ssl/ca.pem \\
  --cluster-signing-key-file=/etc/kubernetes/ssl/ca-key.pem \\
  --service-account-private-key-file=/etc/kubernetes/ssl/ca-key.pem \\
  --root-ca-file=/etc/kubernetes/ssl/ca.pem \\
  --leader-elect=true \\
  --v=2
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF
  • --address must be 127.0.0.1, because this setup expects the scheduler and controller-manager to run on the same machine as kube-apiserver.

  • --cluster-cidr specifies the CIDR range for the Pods in the cluster; this network must be routable between the nodes (flanneld takes care of that).

  • --service-cluster-ip-range specifies the CIDR range for the Services in the cluster; this network must not be routable between the nodes, and the value must match the same parameter of kube-apiserver.

  • The certificate and private key given by --cluster-signing-* are used to sign the certificates and private keys created for TLS BootStrap.

  • --root-ca-file is used to verify the kube-apiserver certificate; only when this parameter is set is the CA certificate placed in the ServiceAccount of Pod containers.

  • --leader-elect=true elects a single working kube-controller-manager process when the master consists of multiple machines.

Start kube-controller-manager

$  cp kube-controller-manager.service /etc/systemd/system/
$  systemctl daemon-reload
$  systemctl enable kube-controller-manager
$  systemctl start kube-controller-manager
$  systemctl status kube-controller-manager

Create the kube-scheduler systemd unit file

$ cat > kube-scheduler.service <<EOF
[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/GoogleCloudPlatform/kubernetes

[Service]
ExecStart=/usr/local/bin/kube-scheduler \\
  --address=127.0.0.1 \\
  --master=http://127.0.0.1:8080 \\
  --leader-elect=true \\
  --v=2
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

Start kube-scheduler

$  cp kube-scheduler.service /etc/systemd/system/
$  systemctl daemon-reload
$  systemctl enable kube-scheduler
$  systemctl start kube-scheduler
$  systemctl status kube-scheduler
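As an extra sanity check (my addition; the ports assume the Kubernetes 1.17 defaults of 10252 for kube-controller-manager and 10251 for kube-scheduler), both components expose a local healthz endpoint that should answer ok:

$ curl http://127.0.0.1:10252/healthz
$ curl http://127.0.0.1:10251/healthz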

Create the admin certificate

Create the admin certificate signing request
$ cd /opt/k8s/ssl_config
$ cat > admin-csr.json <<EOF
{
  "CN": "admin",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "QingDao",
      "L": "QingDao",
      "O": "system:masters",
      "OU": "System"
    }
  ]
}
EOF
  • The O field of the admin certificate is system:masters. The ClusterRoleBinding cluster-admin predefined by kube-apiserver binds the group system:masters to the ClusterRole cluster-admin, which grants permission to call the relevant kube-apiserver APIs.
Generate the admin certificate and private key
$ cfssl gencert -ca=ca.pem \
  -ca-key=ca-key.pem \
  -config=ca-config.json \
  -profile=kubernetes admin-csr.json | cfssljson -bare admin
$ cp admin*.pem /etc/kubernetes/ssl/
Create the kubectl kubeconfig file
$ cd /etc/kubernetes/config
# Set cluster parameters
$ kubectl config set-cluster kubernetes  \
 --certificate-authority=/etc/kubernetes/ssl/ca.pem \
 --embed-certs=true   \
 --server=https://172.16.77.167:6443   \
 --kubeconfig=admin.config
# Set client credentials
$ kubectl config set-credentials admin   \
 --client-certificate=/etc/kubernetes/ssl/admin.pem   \
 --client-key=/etc/kubernetes/ssl/admin-key.pem   \
 --embed-certs=true   \
 --kubeconfig=admin.config
# Set the context
$ kubectl config set-context default   \
 --cluster=kubernetes   \
 --user=admin   \
 --kubeconfig=admin.config
# Use the default context
$ kubectl config use-context default --kubeconfig=admin.config

### Note: if none of the commands above are given --kubeconfig=admin.config, the resulting kubeconfig is saved directly as ~/.kube/config
# Put the kubeconfig in place; it must also be copied to the ~/.kube/ directory of any machine that runs kubectl
$ mkdir ~/.kube
$ cp admin.config /root/.kube/config
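To confirm that this admin credential really maps to cluster-admin through the system:masters group (my addition), kubectl's built-in authorization check should answer yes:

$ kubectl auth can-i '*' '*'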

Verify the master components

$ kubectl get componentstatuses
NAME                 STATUS    MESSAGE             ERROR
scheduler            Healthy   ok
controller-manager   Healthy   ok
etcd-0               Healthy   {"health":"true"}

Node component configuration

Since cfssl is only installed on the master, the node's certificates and kubeconfig files are generated on the master first and then copied to the node.

Create the kube-proxy certificate

Create the kube-proxy certificate signing request
$ cd /opt/k8s/ssl_config
$ cat << EOF | tee kube-proxy-csr.json
{
  "CN": "system:kube-proxy",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "L": "QingDao",
      "ST": "QingDao",
      "O": "k8s",
      "OU": "System"
    }
  ]
}
EOF
  • The RoleBinding system:node-proxier predefined by kube-apiserver binds the user system:kube-proxy to the Role system:node-proxier, which grants permission to call the kube-apiserver proxy-related APIs.
  • CN sets the User of this certificate to system:kube-proxy.
Generate the kube-proxy certificate and private key
$ cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-proxy-csr.json | cfssljson -bare kube-proxy 
$ scp kube-proxy*.pem 172.16.77.153:/etc/kubernetes/ssl/

Create the kubelet bootstrap.kubeconfig file

$ cd /opt/k8s/ssl_config
# Set cluster parameters
$ kubectl config set-cluster kubernetes \
  --certificate-authority=./ca.pem \
  --embed-certs=true \
  --server=https://172.16.77.167:6443 \
  --kubeconfig=bootstrap.kubeconfig
# Set client authentication parameters
$ kubectl config set-credentials kubelet-bootstrap \
  --token=794f6f841f297b6c6ec775636f73544f \
  --kubeconfig=bootstrap.kubeconfig
# Set the context
$ kubectl config set-context default \
  --cluster=kubernetes \
  --user=kubelet-bootstrap \
  --kubeconfig=bootstrap.kubeconfig
# Use the default context

$ kubectl config use-context default --kubeconfig=bootstrap.kubeconfig

# Copy bootstrap.kubeconfig to the node
$ scp bootstrap.kubeconfig 172.16.77.153:/etc/kubernetes/config/

Create the kubelet kubeconfig file

# Set cluster parameters
$ kubectl config set-cluster kubernetes \
  --certificate-authority=./ca.pem \
  --embed-certs=true \
  --server=https://172.16.77.167:6443 \
  --kubeconfig=kubelet.kubeconfig
# Set client authentication parameters
$ kubectl config set-credentials kubelet \
  --token=794f6f841f297b6c6ec775636f73544f \
  --kubeconfig=kubelet.kubeconfig
# Set the context
$ kubectl config set-context default \
  --cluster=kubernetes \
  --user=kubelet \
  --kubeconfig=kubelet.kubeconfig
# Use the default context
$ kubectl config use-context default --kubeconfig=kubelet.kubeconfig

# Copy kubelet.kubeconfig to the node
$ scp kubelet.kubeconfig 172.16.77.153:/etc/kubernetes/config/

Create the kube-proxy kubeconfig file

# Set cluster parameters
$ kubectl config set-cluster kubernetes \
  --certificate-authority=./ca.pem \
  --embed-certs=true \
  --server=https://172.16.77.167:6443 \
  --kubeconfig=kube-proxy.kubeconfig
# Set client authentication parameters
$ kubectl config set-credentials kube-proxy \
  --client-certificate=./kube-proxy.pem \
  --client-key=./kube-proxy-key.pem \
  --embed-certs=true \
  --kubeconfig=kube-proxy.kubeconfig
# Set the context
$ kubectl config set-context default \
  --cluster=kubernetes \
  --user=kube-proxy \
  --kubeconfig=kube-proxy.kubeconfig
# Use the default context
$ kubectl config use-context default --kubeconfig=kube-proxy.kubeconfig

# Copy kube-proxy.kubeconfig to the node
$ scp kube-proxy.kubeconfig 172.16.77.153:/etc/kubernetes/config/

Node deployment

Install and configure the kubelet

Bind the kubelet-bootstrap user to the system cluster role

$ kubectl create clusterrolebinding kubelet-bootstrap --clusterrole=system:node-bootstrapper --user=kubelet-bootstrap

Create the kubelet systemd unit file

$ mkdir /var/lib/kubelet  # create the working directory
$ cat > kubelet.service <<EOF
[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=docker.service
Requires=docker.service

[Service]
WorkingDirectory=/var/lib/kubelet
ExecStart=/usr/local/bin/kubelet \\
  --fail-swap-on=false \\
  --cgroup-driver=cgroupfs \\
  --address=172.16.77.153 \\
  --hostname-override=172.16.77.153 \\
  --bootstrap-kubeconfig=/etc/kubernetes/config/bootstrap.kubeconfig \\
  --kubeconfig=/etc/kubernetes/config/kubelet.kubeconfig \\
  --cert-dir=/etc/kubernetes/ssl \\
  --cluster-dns=10.254.0.2 \\
  --cluster-domain=cluster.local. \\
  --hairpin-mode promiscuous-bridge \\
  --serialize-image-pulls=false \\
  --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google-containers/pause-amd64:3.0 \\
  --logtostderr=true \\
  --v=2
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF
  • --cgroup-driver is what the kubelet uses to manage the host's cgroups. The default is cgroupfs, but the value must match docker's configuration (docker info | grep cgroup); see the quick check after this list.
  • --address must not be set to 127.0.0.1, otherwise Pods calling the kubelet API will fail later, because from inside a Pod 127.0.0.1 points to the Pod itself rather than to the kubelet.
  • If --hostname-override is set, kube-proxy must set it as well, otherwise the Node will not be found.
  • --bootstrap-kubeconfig points to the bootstrap kubeconfig file; the kubelet uses the user name and token in that file to send the TLS Bootstrapping request to kube-apiserver.
  • After the administrator approves the CSR request, the kubelet automatically creates the certificate and private key (kubelet-client.crt and kubelet-client.key) in the --cert-dir directory and then writes the file given by --kubeconfig (the file specified by --kubeconfig is created automatically).
  • It is recommended to specify the kube-apiserver address in the --kubeconfig file. If --api-servers is not specified, --require-kubeconfig must be set so that the address is read from the config file, otherwise the kubelet will not find the kube-apiserver after starting (the log reports that no API Server was found) and kubectl get nodes will not return the Node.
  • --cluster-dns specifies the Service IP of kube-dns (it can be allocated now and used later when the kube-dns service is created), and --cluster-domain specifies the domain suffix; both parameters must be set for them to take effect.
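
Before starting the kubelet it is worth double-checking the cgroup driver match described above (my addition):

$ docker info 2>/dev/null | grep -i "cgroup driver"
# the value shown (cgroupfs or systemd) must match the kubelet's --cgroup-driver flag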

Start the kubelet

$ cp kubelet.service /etc/systemd/system/kubelet.service
$ systemctl daemon-reload
$ systemctl enable kubelet
$ systemctl start kubelet
$ systemctl status kubelet

Approve the kubelet certificate request

$ kubectl get csr
NAME                                                   AGE    REQUESTOR           CONDITION
node-csr-akY5FFofACgSYjgh-5q-Yk9PHrMK53y6duZLy8-XjfU   106s   kubelet-bootstrap   Pending
$ kubectl certificate approve node-csr-akY5FFofACgSYjgh-5q-Yk9PHrMK53y6duZLy8-XjfU
certificatesigningrequest.certificates.k8s.io/node-csr-akY5FFofACgSYjgh-5q-Yk9PHrMK53y6duZLy8-XjfU approved
$ kubectl get csr
NAME                                                   AGE    REQUESTOR           CONDITION
node-csr-akY5FFofACgSYjgh-5q-Yk9PHrMK53y6duZLy8-XjfU   2m6s   kubelet-bootstrap   Approved,Issued
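
If several nodes are bootstrapping at the same time (my addition), all pending CSRs can be approved in one go:

$ kubectl get csr -o name | xargs kubectl certificate approve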

Install and configure kube-proxy

$ mkdir -p /var/lib/kube-proxy  # create the working directory
$  cat > kube-proxy.service <<EOF
[Unit]
Description=Kubernetes Kube-Proxy Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target

[Service]
WorkingDirectory=/var/lib/kube-proxy
ExecStart=/usr/local/bin/kube-proxy \\
  --bind-address=172.16.77.153 \\
  --hostname-override=172.16.77.153 \\
  --cluster-cidr=172.18.0.0/16 \\
  --kubeconfig=/etc/kubernetes/config/kube-proxy.kubeconfig \\
  --logtostderr=true \\
  --v=2
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF
  • The --hostname-override value must match the one used by the kubelet, otherwise kube-proxy will not find the Node after starting and will not create any iptables rules.
  • --cluster-cidr is the Pod CIDR of the cluster and should match the kube-controller-manager --cluster-cidr value (172.18.0.0/16 here), not the Service range.
  • kube-proxy uses --cluster-cidr to distinguish traffic from inside and outside the cluster; only when --cluster-cidr or --masquerade-all is specified will kube-proxy SNAT requests that access Service IPs.

Start kube-proxy

$ cp kube-proxy.service /etc/systemd/system/
$ systemctl daemon-reload
$ systemctl enable kube-proxy
$ systemctl start kube-proxy
$ systemctl status kube-proxy

Verify the cluster

  • nginx-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.15.1
        ports:
        - containerPort: 80
          name: nginx
  • nginx-svc.yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
  - protocol: TCP
    nodePort: 33381
    port: 80
    targetPort: 80
    name: nginx
  type: NodePort
  • Create the resources
$ kubectl apply -f nginx-deployment.yaml
$ kubectl apply -f nginx-svc.yaml
  • Check them
$ kubectl get svc,pods
NAME                    TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
service/kubernetes      ClusterIP   10.254.0.1      <none>        443/TCP        121m
service/nginx-service   NodePort    10.254.109.96   <none>        80:33381/TCP   9m14s

NAME                                    READY   STATUS    RESTARTS   AGE
pod/nginx-deployment-68884fc8f8-z9wkp   1/1     Running   0          9m21s

Open http://172.16.77.153:33381/; if everything works you should see the nginx welcome page.
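
The same check can be done from the command line (my addition):

$ curl -I http://172.16.77.153:33381/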

  • The master node does not need to run the kubelet or kube-proxy. If you want to access Services from the master, start kube-proxy there; if you want Pods to be scheduled onto the master, start the kubelet there as well.

Configure kubectl command completion

$ source <(kubectl completion bash)
$ echo "source <(kubectl completion bash)" >> ~/.bashrc

Extension 1

Install CoreDNS

The kubernetes-server-linux-amd64.tar.gz downloaded earlier already contains the CoreDNS manifests (inside kubernetes-src.tar.gz).

CoreDNS configuration and installation

$ cd /opt/k8s/install/kubernetes/
$ tar -zxvf kubernetes-src.tar.gz
$ cd cluster/addons/dns/coredns/
$ ls
coredns.yaml  coredns.yaml.base  coredns.yaml.in  coredns.yaml.sed  Makefile  transforms2salt.sed  transforms2sed.sed
$ cat transforms2sed.sed
s/__PILLAR__DNS__SERVER__/$DNS_SERVER_IP/g
s/__PILLAR__DNS__DOMAIN__/$DNS_DOMAIN/g
s/__PILLAR__CLUSTER_CIDR__/$SERVICE_CLUSTER_IP_RANGE/g
s/__PILLAR__DNS__MEMORY__LIMIT__/$DNS_MEMORY_LIMIT/g
s/__MACHINE_GENERATED_WARNING__/Warning: This is a file generated from the base underscore template file: __SOURCE_FILENAME__/g
## Replace $DNS_SERVER_IP and $DNS_DOMAIN with the values configured for the kubelet (see the parameters in the kubelet unit file):
## replace $DNS_SERVER_IP with 10.254.0.2, $DNS_DOMAIN with cluster.local. and $DNS_MEMORY_LIMIT with 170Mi

## After editing, run the following command to generate the coredns.yaml needed to deploy CoreDNS:
$ sed -f transforms2sed.sed coredns.yaml.base > coredns.yaml

# Change the replica count to 2 and replace k8s.gcr.io/coredns:1.6.5 with coredns/coredns:1.6.5
$ vim coredns.yaml
......
spec:
  replicas: 2 # add this line
.......
      - name: coredns
        image: coredns/coredns:1.6.5 # changed from k8s.gcr.io/coredns:1.6.5
.......
#### Deploy CoreDNS
$ kubectl apply -f coredns.yaml
$  kubectl get pods -n kube-system
NAME                       READY   STATUS    RESTARTS   AGE
coredns-6bddb9b7d7-bvznv   1/1     Running   0          2m19s
coredns-6bddb9b7d7-dpqrv   1/1     Running   0          2m19s

Verify DNS

# Test with the nginx deployed earlier
$ kubectl get pod,svc
NAME                                    READY   STATUS    RESTARTS   AGE
pod/nginx-deployment-68884fc8f8-p6pmh   1/1     Running   0          8m1s

NAME                    TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
service/kubernetes      ClusterIP   10.254.0.1      <none>        443/TCP        5h32m
service/nginx-service   NodePort    10.254.109.96   <none>        80:33381/TCP   3h39m
# Run a busybox pod to test DNS
$ kubectl run -it --image=busybox:1.28.4 --rm --restart=Never sh
If you don't see a command prompt, try pressing enter.
/ # nslookup kubernetes
Server:    10.254.0.2
Address 1: 10.254.0.2 kube-dns.kube-system.svc.cluster.local

Name:      kubernetes
Address 1: 10.254.0.1 kubernetes.default.svc.cluster.local
/ # nslookup nginx-service
Server:    10.254.0.2
Address 1: 10.254.0.2 kube-dns.kube-system.svc.cluster.local

Name:      nginx-service
Address 1: 10.254.109.96 nginx-service.default.svc.cluster.local
/ # nslookup nginx-service.default.svc.cluster.local
Server:    10.254.0.2
Address 1: 10.254.0.2 kube-dns.kube-system.svc.cluster.local

Name:      nginx-service.default.svc.cluster.local
Address 1: 10.254.109.96 nginx-service.default.svc.cluster.local
/ # cat /etc/resolv.conf
nameserver 10.254.0.2
search default.svc.cluster.local. svc.cluster.local. cluster.local.
options ndots:5

Handling kubelet certificate expiry

Note: with this installation method, the kubelet certificates are produced from the CSR requests the kubelet sends at startup and are actually signed by the controller manager. The default validity is one year; once it expires, the node can no longer communicate with the master.

Add the parameters

  • Modify the kubelet configuration
# Add the following to the kubelet startup arguments
--feature-gates=RotateKubeletServerCertificate=true
--feature-gates=RotateKubeletClientCertificate=true
# Versions 1.8 and above support automatic reload after certificate rotation; older versions can only restart the service manually
--rotate-certificates=true
  • Modify the kube-controller-manager configuration
# Add the following to the kube-controller-manager startup arguments
--feature-gates=RotateKubeletClientCertificate=true,RotateKubeletServerCertificate=true
--experimental-cluster-signing-duration=876000h0m0s # sets the validity period of the signed certificates

Create a ClusterRole that auto-approves the relevant CSR requests

$ vim tls-instructs-csr.yaml
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: system:certificates.k8s.io:certificatesigningrequests:selfnodeserver
rules:
- apiGroups: ["certificates.k8s.io"]
  resources: ["certificatesigningrequests/selfnodeserver"]
  verbs: ["create"]
$ kubectl apply -f tls-instructs-csr.yaml
  • Automatically approve the CSR request made by the kubelet-bootstrap user when it first requests a certificate during TLS bootstrapping
$ kubectl create clusterrolebinding node-client-auto-approve-csr --clusterrole=system:certificates.k8s.io:certificatesigningrequests:nodeclient --user=kubelet-bootstrap
  • Automatically approve CSR requests from the system:nodes group to renew the certificate the kubelet uses to talk to the apiserver
$ kubectl create clusterrolebinding node-client-auto-renew-crt --clusterrole=system:certificates.k8s.io:certificatesigningrequests:selfnodeclient --group=system:nodes
  • Automatically approve CSR requests from the system:nodes group to renew the certificate for the kubelet's port 10250 API
$ kubectl create clusterrolebinding node-server-auto-renew-crt --clusterrole=system:certificates.k8s.io:certificatesigningrequests:selfnodeserver --group=system:nodes

Restart the kube-controller-manager and kubelet services

$ systemctl restart kube-controller-manager

# Delete the old kubelet certificates
$ cd /etc/kubernetes/ssl
$ rm -rf kubelet*
# Restart the kubelet
$ systemctl restart kubelet
# Check the certificate validity period
$ openssl x509 -in kubelet-client-current.pem -noout -text | grep "Not"
            Not Before: Apr 21 07:54:37 2020 GMT
            Not After : Mar 28 00:51:00 2120 GMT

Extension 2

This method only differs in mechanism from the TLS Bootstrapping Token approach above; the end result is the same. Pick one of the two.

Completing TLS Bootstrapping with a Bootstrap Token

Create the Bootstrap Token

$ echo "$(head -c 6 /dev/urandom | md5sum | head -c 6)"."$(head -c 16 /dev/urandom | md5sum | head -c 16)"
  • The token must match the format [a-z0-9]{6}\.[a-z0-9]{16}. It is split on the dot: the first part is the Token ID, which is not secret and may be exposed; the second part is the Token Secret, which should be kept confidential.
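
A quick format check (my addition, using a shell variable named TOKEN purely for illustration):

$ TOKEN="$(head -c 6 /dev/urandom | md5sum | head -c 6).$(head -c 16 /dev/urandom | md5sum | head -c 16)"
$ echo "$TOKEN" | grep -E '^[a-z0-9]{6}\.[a-z0-9]{16}$' && echo "format ok"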

Create the Bootstrap Token Secret

$ vim bootstrap-token-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  # Name MUST be of form "bootstrap-token-<token id>"
  name: bootstrap-token-e40252
  namespace: kube-system

# Type MUST be 'bootstrap.kubernetes.io/token'
type: bootstrap.kubernetes.io/token
stringData:
  # Human readable description. Optional.
  description: "The default bootstrap token generated by 'kubeadm init'."

  # Token ID and secret. Required.
  token-id: e40252
  token-secret: 647feb2e573070b3

  # Expiration. Optional.
  expiration: 2020-04-24T00:00:11Z

  # Allowed usages.
  usage-bootstrap-authentication: "true"
  usage-bootstrap-signing: "true"

  # Extra groups to authenticate the token as. Must start with "system:bootstrappers:"
  auth-extra-groups: system:bootstrappers:worker,system:bootstrappers:ingress
  • The type of a Bootstrap Token Secret must be bootstrap.kubernetes.io/token, and its name must be bootstrap-token-<token id> (the Token ID is the first part of the token created in the previous step).
  • usage-bootstrap-authentication and usage-bootstrap-signing must both be present and set to true.
  • The expiration field is optional; if set, the Secret is cleaned up automatically by the tokencleaner controller in the Controller Manager after it expires.
  • auth-extra-groups is also optional; it lists extra groups the token authenticates as, and each group must start with system:bootstrappers:.
  • Official documentation: https://kubernetes.io/docs/reference/access-authn-authz/bootstrap-tokens/#bootstrap-token-secret-format
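
The original notes do not show it explicitly, but the Secret still has to be created in the cluster before the token can be used:

$ kubectl apply -f bootstrap-token-secret.yaml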

Create the ClusterRole and ClusterRoleBindings

Create the ClusterRole
$ vim tls-instructs-csr.yaml
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: system:certificates.k8s.io:certificatesigningrequests:selfnodeserver
rules:
- apiGroups: ["certificates.k8s.io"]
  resources: ["certificatesigningrequests/selfnodeserver"]
  verbs: ["create"]
$ kubectl apply -f tls-instructs-csr.yaml
Create the corresponding ClusterRoleBindings

Note that when bootstrapping with a Bootstrap Token, requests the kubelet makes with the token carry the user name system:bootstrap:<token id> and the group system:bootstrappers, so the ClusterRoleBindings must be bound to that user or group; here they are bound to the group.

# Allow users in the system:bootstrappers group to create CSR requests
$ kubectl create clusterrolebinding kubelet-bootstrap --clusterrole=system:node-bootstrapper --group=system:bootstrappers

# Automatically approve first-time TLS bootstrapping certificate CSR requests from the system:bootstrappers group
$ kubectl create clusterrolebinding node-client-auto-approve-csr --clusterrole=system:certificates.k8s.io:certificatesigningrequests:nodeclient --group=system:bootstrappers

# Automatically approve CSR requests from the system:nodes group to renew the certificate the kubelet uses to talk to the apiserver
$ kubectl create clusterrolebinding node-client-auto-renew-crt --clusterrole=system:certificates.k8s.io:certificatesigningrequests:selfnodeclient --group=system:nodes

# Automatically approve CSR requests from the system:nodes group to renew the certificate for the kubelet's port 10250 API
$ kubectl create clusterrolebinding node-server-auto-renew-crt --clusterrole=system:certificates.k8s.io:certificatesigningrequests:selfnodeserver --group=system:nodes

Adjust kube-controller-manager

# Add the following to the startup arguments
--controllers=*,bootstrapsigner,tokencleaner

Generate bootstrap.kubeconfig

# Set cluster parameters
$ kubectl config set-cluster kubernetes \
  --certificate-authority=/etc/kubernetes/ssl/ca.pem \
  --embed-certs=true \
  --server=https://172.16.77.167:6443 \
  --kubeconfig=bootstrap.kubeconfig
# Set client authentication parameters
$ kubectl config set-credentials system:bootstrap:e40252 \
  --token=e40252.647feb2e573070b3 \
  --kubeconfig=bootstrap.kubeconfig
# Set the context
$ kubectl config set-context default \
  --cluster=kubernetes \
  --user=system:bootstrap:e40252 \
  --kubeconfig=bootstrap.kubeconfig
# Use the default context
$ kubectl config use-context default --kubeconfig=bootstrap.kubeconfig

Generate kubelet.kubeconfig

# Set cluster parameters
$ kubectl config set-cluster kubernetes \
  --certificate-authority=/etc/kubernetes/ssl/ca.pem \
  --embed-certs=true \
  --server=https://172.16.77.167:6443 \
  --kubeconfig=kubelet.kubeconfig
# Set client authentication parameters
$ kubectl config set-credentials kubelet \
  --token=e40252.647feb2e573070b3 \
  --kubeconfig=kubelet.kubeconfig
# Set the context
$ kubectl config set-context default \
  --cluster=kubernetes \
  --user=kubelet \
  --kubeconfig=kubelet.kubeconfig
# Use the default context
$ kubectl config use-context default --kubeconfig=kubelet.kubeconfig

Restart kube-controller-manager and the kubelet

$ systemctl daemon-reload
$ systemctl restart kube-controller-manager.service
$ systemctl restart kubelet.service

References

Yangming's blog

https://cloud.tencent.com/developer/article/1511862

https://blog.csdn.net/qq_24794401/article/details/103245796

https://mritd.me/2018/01/07/kubernetes-tls-bootstrapping-note/

https://kubernetes.io/docs/reference/access-authn-authz/bootstrap-tokens/#bootstrap-token-secret-format

https://mritd.me/2018/08/28/kubernetes-tls-bootstrapping-with-bootstrap-token/