Service Mesh - Setting Up a Kubernetes & Istio Environment with kubeadm and MetalLB

Assumptions

Role    IP             OS            RAM  CPU
Master  172.16.50.146  Ubuntu 20.04  4G   2
Node1   172.16.50.147  Ubuntu 20.04  4G   2
Node2   172.16.50.148  Ubuntu 20.04  4G   2

Before installation

Change the hostname

  • Master

    $ sudo vim /etc/hostname
    $ cat /etc/hostname
    master

  • Node1

    $ sudo vim /etc/hostname
    $ cat /etc/hostname
    node1

  • Node2

    $ sudo vim /etc/hostname
    $ cat /etc/hostname
    node2
    

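Note: Running sudo hostname xxx changes the hostname only temporarily; after a reboot the old hostname comes back. It is therefore better to edit /etc/hostname directly, which changes the hostname permanently.

If the hostname changes after a reboot and kubelet reports kubelet.go:2268] node "xxx" not found, correct the hostname and then restart kubelet with systemctl restart kubelet.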
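By default the hostname becomes the node name, so it helps to check a candidate name before writing it to /etc/hostname. The following is a minimal sketch, not part of the original setup: is_valid_hostname is a hypothetical helper, and it assumes the stricter RFC 1123 label rules (lowercase letters, digits, and hyphens, at most 63 characters, alphanumeric at both ends).

```shell
# Hypothetical helper: succeed only if $1 is a valid RFC 1123 label
# (lowercase alphanumerics and '-', 1-63 chars, alphanumeric at both ends).
is_valid_hostname() {
  name=$1
  # Reject empty names and names longer than 63 characters.
  [ -n "$name" ] && [ "${#name}" -le 63 ] || return 1
  # The anchored pattern must match the whole name.
  printf '%s\n' "$name" | grep -Eq '^[a-z0-9]([a-z0-9-]*[a-z0-9])?$'
}

# Check a few candidate names; Node_2 is a deliberately bad example.
for candidate in master node1 Node_2; do
  if is_valid_hostname "$candidate"; then
    echo "$candidate: ok"
  else
    echo "$candidate: invalid"
  fi
done
```

With the three sample names, master and node1 pass and Node_2 is rejected because of the uppercase letter and the underscore.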

Verify that the MAC address and product_uuid are unique on every node

$ ip link
$ sudo cat /sys/class/dmi/id/product_uuid
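Comparing the values by eye across several terminals is error-prone, especially for cloned virtual machines. As a rough sketch, assume you have collected one "node-name value" pair per line (that format is an assumption of this example); the hypothetical helper find_duplicate_ids then prints any value that appears more than once.

```shell
# Hypothetical helper: read "name value" pairs on stdin and print every
# value that is reported by more than one node.
find_duplicate_ids() {
  awk '{ print $2 }' | sort | uniq -d
}

# Sample input with made-up UUIDs; node2 is a clone of node1 here.
dups=$(printf '%s\n' \
  'master 11111111-aaaa-4000-8000-000000000001' \
  'node1  22222222-bbbb-4000-8000-000000000002' \
  'node2  22222222-bbbb-4000-8000-000000000002' | find_duplicate_ids)

if [ -n "$dups" ]; then
  echo "duplicate IDs found: $dups"
else
  echo "all IDs are unique"
fi
```

The same helper works for MAC addresses, as long as each input line carries the node name first and the value second.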

Disable the firewall on every node

$ sudo ufw disable

Disable swap on every node

$ sudo swapoff -a; sudo sed -i '/swap/d' /etc/fstab
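The sed expression above deletes every line of /etc/fstab that contains the word swap. If you would rather keep the entries for reference, one possible alternative is to comment them out. The sketch below uses a hypothetical helper, disable_swap_entries, that writes the modified file to standard output so you can review it before replacing anything.

```shell
# Hypothetical helper: print the fstab-style file $1 with active swap
# entries commented out; every other line is printed unchanged.
disable_swap_entries() {
  # An active swap entry is an uncommented line whose third field is "swap".
  awk '$1 !~ /^#/ && $3 == "swap" { print "# " $0; next } { print }' "$1"
}

# Try it on a throwaway file first; the entries below are made up.
tmp=$(mktemp)
printf '%s\n' \
  'UUID=abcd-1234 / ext4 defaults 0 1' \
  '/swap.img none swap sw 0 0' > "$tmp"
disable_swap_entries "$tmp"
rm -f "$tmp"
```

Because the helper only prints, applying it to the real file is a separate, deliberate step (for example, redirecting the output to a new file and copying it over /etc/fstab after checking it).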

Tip: To run the same command on several machines at once, use iTerm's ⌘ + Shift + i shortcut to type into all tabs simultaneously.

Let iptables see bridged traffic on every node

Load br_netfilter

$ sudo modprobe br_netfilter
$ lsmod | grep br_netfilter
br_netfilter           28672  0
bridge                176128  1 br_netfilter
$ cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
> br_netfilter
> EOF
br_netfilter

Set net.bridge.bridge-nf-call-iptables

$ cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
> net.bridge.bridge-nf-call-ip6tables = 1
> net.bridge.bridge-nf-call-iptables = 1
> EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1

$ sudo sysctl --system
* Applying /etc/sysctl.d/10-console-messages.conf ...
kernel.printk = 4 4 1 7
......
* Applying /etc/sysctl.conf ...
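A typo in the sysctl file is easy to miss in the long sysctl --system output. As a small sketch, the hypothetical helper check_bridge_sysctls below confirms that a given file sets both bridge keys from this section to 1. It only inspects the file passed to it, not the live kernel values.

```shell
# Hypothetical helper: succeed only if the sysctl file $1 sets both
# bridge-netfilter keys used in this guide to the value 1.
check_bridge_sysctls() {
  for key in net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables; do
    # Strip spaces and tabs, then look for an exact "key=1" line.
    tr -d ' \t' < "$1" | grep -Fxq "$key=1" || {
      echo "missing or wrong: $key" >&2
      return 1
    }
  done
}

# Demonstrate on a temporary file with the same content as k8s.conf above.
conf=$(mktemp)
printf '%s\n' \
  'net.bridge.bridge-nf-call-ip6tables = 1' \
  'net.bridge.bridge-nf-call-iptables = 1' > "$conf"
check_bridge_sysctls "$conf" && echo "bridge sysctls look right"
rm -f "$conf"
```

On a node you would pass /etc/sysctl.d/k8s.conf instead of the temporary file.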

Install Docker on every node

Install Docker by following https://docs.docker.com/engine/install/ubuntu/ and https://docs.docker.com/engine/install/linux-postinstall/.

Configuration

Configure the Docker daemon, in particular so that it uses systemd to manage the containers' cgroups.

sudo mkdir -p /etc/docker
cat <<EOF | sudo tee /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
EOF

Restart Docker and enable it at boot:

sudo systemctl enable docker
sudo systemctl daemon-reload
sudo systemctl restart docker

Install kubeadm, kubelet, and kubectl on every node

Update the apt package index and install the packages needed to use the Kubernetes apt repository:

sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl

Download the Google Cloud public signing key:

sudo curl -fsSLo /usr/share/keyrings/kubernetes-archive-keyring.gpg https://packages.cloud.google.com/apt/doc/apt-key.gpg

Add the Kubernetes apt repository:

echo "deb [signed-by=/usr/share/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list

Update the apt package index, install kubelet, kubeadm, and kubectl, and pin their versions:

sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl

On the master node

Initialize the Kubernetes cluster

Replace 172.16.50.146 in the following command with the IP address of your master.

$ sudo kubeadm init --apiserver-advertise-address=172.16.50.146 --pod-network-cidr=192.168.0.0/16 --ignore-preflight-errors=all
[init] Using Kubernetes version: v1.20.5
[preflight] Running pre-flight checks
	[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.5. Latest validated version: 19.03
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local master] and IPs [10.96.0.1 172.16.50.146]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost master] and IPs [172.16.50.146 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost master] and IPs [172.16.50.146 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[apiclient] All control plane components are healthy after 82.502493 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.20" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node master as control-plane by adding the labels "node-role.kubernetes.io/master=''" and "node-role.kubernetes.io/control-plane='' (deprecated)"
[mark-control-plane] Marking the node master as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: en2kq9.2basuxxemkuv1yvu
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 172.16.50.146:6443 --token en2kq9.2basuxxemkuv1yvu \
    --discovery-token-ca-cert-hash sha256:97e84ca61b5d888476f5cdfd36fa141eaf2631e78e7d32c8c3d209e54be72870

Configure kubectl

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Deploy the Calico network

Install the Tigera Calico operator and custom resource definitions.

kubectl create -f https://docs.projectcalico.org/manifests/tigera-operator.yaml

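Install Calico by creating the necessary custom resources. For more information on the configuration options available in this manifest, see the installation reference.

kubectl create -f https://docs.projectcalico.org/manifests/custom-resources.yaml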

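Note: Before creating this manifest, read its contents and make sure its settings suit your environment. For example, you may need to change the default IP pool CIDR to match your pod network CIDR.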
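One thing worth checking when choosing the pod network CIDR is that it does not cover the addresses your nodes already use. The sketch below is one way to test that in plain shell arithmetic. ip_to_int and ip_in_cidr are hypothetical helpers written for this example, and they assume well-formed IPv4 input without leading zeros.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if address $1 falls inside CIDR block $2 (e.g. 192.168.0.0/16).
ip_in_cidr() {
  addr=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  # Build the netmask: the top $bits bits set, the rest cleared.
  mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
  [ $(( $addr & $mask )) -eq $(( $net & $mask )) ]
}

# Check the node IPs from this guide against the pod network CIDR.
pod_cidr=192.168.0.0/16
for node_ip in 172.16.50.146 172.16.50.147 172.16.50.148; do
  if ip_in_cidr "$node_ip" "$pod_cidr"; then
    echo "$node_ip overlaps with $pod_cidr"
  else
    echo "$node_ip is outside $pod_cidr"
  fi
done
```

For the addresses in this guide, all three node IPs fall outside 192.168.0.0/16, so the CIDR passed to kubeadm init does not collide with the node network.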

Confirm that all of the pods are running with the following command.

kubectl get pods -n calico-system -w

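Wait until every pod is Running.

Note: The Tigera operator installs resources in the calico-system namespace. Other installation methods may use the kube-system namespace instead.

Remove the taint from the master node so that pods can be scheduled on it.

kubectl taint nodes --all node-role.kubernetes.io/master-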

It should return the following.

node/master untainted

Confirm that the cluster now has one node. The output should look similar to the following.

$ kubectl get nodes -o wide
NAME     STATUS   ROLES                  AGE     VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME
master   Ready    control-plane,master   8m57s   v1.20.5   172.16.50.146   <none>        Ubuntu 20.04.2 LTS   5.4.0-70-generic   docker://20.10.5

On the worker nodes

Join the cluster

Run the join command on each node (you can find it in the output of the cluster initialization step).

$ sudo kubeadm join 172.16.50.146:6443 --token en2kq9.2basuxxemkuv1yvu \
>     --discovery-token-ca-cert-hash sha256:97e84ca61b5d888476f5cdfd36fa141eaf2631e78e7d32c8c3d209e54be72870
[preflight] Running pre-flight checks
	[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.5. Latest validated version: 19.03
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

If the token has expired, create a new one on the master node.

kubeadm token create --print-join-command
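When a join command is copied between terminals, it is easy to lose a character from the token. A kubeadm bootstrap token has the form of six characters, a dot, and sixteen characters, all lowercase letters or digits. The sketch below checks that shape; extract_token and is_bootstrap_token are hypothetical helpers for this example and do not verify that the token is still valid on the cluster.

```shell
# Hypothetical helper: pull the value that follows --token out of a
# join command passed as a single string.
extract_token() {
  printf '%s\n' "$1" | sed -n 's/.*--token[ =]\([^ ]*\).*/\1/p'
}

# Hypothetical helper: succeed if $1 has the bootstrap token shape
# [a-z0-9]{6}.[a-z0-9]{16}.
is_bootstrap_token() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# The join command from the kubeadm init output in this guide.
join_cmd='kubeadm join 172.16.50.146:6443 --token en2kq9.2basuxxemkuv1yvu --discovery-token-ca-cert-hash sha256:97e84ca61b5d888476f5cdfd36fa141eaf2631e78e7d32c8c3d209e54be72870'

token=$(extract_token "$join_cmd")
if is_bootstrap_token "$token"; then
  echo "token format ok: $token"
else
  echo "token looks malformed: $token"
fi
```

A well-formed token can still be expired, so the check only catches copy-and-paste damage.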

Verify the cluster

$ kubectl get node
NAME     STATUS   ROLES                  AGE     VERSION
master   Ready    control-plane,master   19m     v1.20.5
node1    Ready    <none>                 5m48s   v1.20.5
node2    Ready    <none>                 4m57s   v1.20.5
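If you want to script this check rather than read the table, one simple approach is to filter the output for nodes whose STATUS is not Ready. The helper below, not_ready_nodes, is a hypothetical name, and the sample table is made up so the sketch runs without a cluster.

```shell
# Hypothetical helper: read `kubectl get node` table output on stdin and
# print the names of nodes whose STATUS column is not exactly "Ready".
not_ready_nodes() {
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Sample table; on a real cluster, pipe `kubectl get node` into the helper.
pending=$(printf '%s\n' \
  'NAME     STATUS     ROLES                  AGE     VERSION' \
  'master   Ready      control-plane,master   19m     v1.20.5' \
  'node1    NotReady   <none>                 5m48s   v1.20.5' | not_ready_nodes)
echo "nodes not ready: ${pending:-none}"
```

Note that the exact-match test also reports cordoned nodes, whose status reads Ready,SchedulingDisabled; whether that is desirable depends on what you are checking for.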

Control the cluster from another machine

On the master node, copy admin.conf to the $HOME directory.

$ sudo cp /etc/kubernetes/admin.conf $HOME
$ sudo chown {user} /home/{user}/admin.conf

Copy $HOME/admin.conf to the other machine with scp.

$ scp {user}@172.16.50.146:/home/{user}/admin.conf .

$ kubectl --kubeconfig ./admin.conf get nodes
NAME     STATUS   ROLES                  AGE   VERSION
master   Ready    control-plane,master   45m   v1.20.5
node1    Ready    <none>                 31m   v1.20.5
node2    Ready    <none>                 31m   v1.20.5

Metrics Server

Metrics Server is deployed so that the top command can show basic resource metrics.

Without Metrics Server, the top command returns an error.

$ kubectl top node
error: Metrics API not available

Installation

$ kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
serviceaccount/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
service/metrics-server created
deployment.apps/metrics-server created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created

Configuration

$ kubectl edit deploy -n kube-system metrics-server

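This opens the Deployment manifest in a text editor, where you need to make the following change.

Add the following argument to the args list of the container under spec.template.spec.containers:

- --kubelet-insecure-tls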

After the change, the Deployment should look roughly like this:

containers:
  - args:
      - --cert-dir=/tmp
      - --secure-port=4443
      # Add this line
      - --kubelet-insecure-tls
      - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
      - --kubelet-use-node-status-port
    image: k8s.gcr.io/metrics-server/metrics-server:v0.4.2
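If you prefer not to edit the Deployment interactively, the same change can probably be expressed as a JSON Patch and applied with kubectl patch. The sketch below only builds and prints the patch document. build_arg_patch is a hypothetical helper, and the path assumes that metrics-server is the first container (index 0) in the pod template.

```shell
# Hypothetical helper: build a JSON Patch that appends one argument to the
# args list of the first container in a Deployment's pod template.
build_arg_patch() {
  printf '[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"%s"}]\n' "$1"
}

patch=$(build_arg_patch --kubelet-insecure-tls)
echo "$patch"

# On a machine with cluster access, the patch could then be applied with:
#   kubectl -n kube-system patch deployment metrics-server --type=json -p "$patch"
```

The trailing "-" in the path appends to the end of the list, so the new flag lands after the existing arguments rather than in the position shown in the edited manifest; the order of these flags should not matter.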

Wait for the metrics-server pod to reach the Running status.

$ kubectl get pod -n kube-system -w
NAME                              READY   STATUS              RESTARTS   AGE
...
metrics-server-76f8d9fc69-jb94v   0/1     ContainerCreating   0          43s
metrics-server-76f8d9fc69-jb94v   1/1     Running             0          81s

Run the top command again.

$ kubectl top node
NAME     CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
master   242m         12%    2153Mi          56%
node1    143m         7%     2158Mi          56%
node2    99m          4%     1665Mi          43%

Dashboard

Installation

kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.2.0/aio/deploy/recommended.yaml

Confirm that all of the pods are running with the following command.

$ kubectl get pod -n kubernetes-dashboard -w
NAME                                         READY   STATUS    RESTARTS   AGE
dashboard-metrics-scraper-79c5968bdc-w2gmc   1/1     Running   0          8m
kubernetes-dashboard-9f9799597-w9fbz         1/1     Running   0          8m

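Access

To access the Dashboard from your local workstation, you must create a secure channel to the Kubernetes cluster. Run the following command:

kubectl proxy

Then open http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/ to access the Dashboard.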

Create a sample user

Create the Service Account

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard
EOF

Create the ClusterRoleBinding

cat <<EOF | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kubernetes-dashboard
EOF

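Get the Bearer Token

$ kubectl -n kubernetes-dashboard get secret $(kubectl -n kubernetes-dashboard get sa/admin-user -o jsonpath="{.secrets[0].name}") -o go-template="{{.data.token | base64decode}}"

Now copy the token and paste it into the Enter token field on the login screen.

Click the Sign in button and that's it. You are now logged in as an administrator.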

Istio

Install istioctl

curl -L https://istio.io/downloadIstio | ISTIO_VERSION=1.9.2 sh -
cd istio-1.9.2
export PATH=$PWD/bin:$PATH

Deploy the Istio operator

$ istioctl operator init
Installing operator controller in namespace: istio-operator using image: docker.io/istio/operator:1.9.2
Operator controller will watch namespaces: istio-system
✔ Istio operator installed
✔ Installation complete

Install Istio

$ kubectl create ns istio-system
namespace/istio-system created

$ kubectl apply -f - <<EOF
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  namespace: istio-system
  name: example-istiocontrolplane
spec:
  profile: default
EOF
istiooperator.install.istio.io/example-istiocontrolplane created

Confirm that all of the pods are running with the following command.

$ kubectl get pod -n istio-system -w
NAME                                    READY   STATUS    RESTARTS   AGE
istio-ingressgateway-7cc49dcd99-c4mtf   1/1     Running   0          94s
istiod-687f965684-n8rkv                 1/1     Running   0          3m26s

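MetalLB

Kubernetes does not offer an implementation of network load balancers (Services of type LoadBalancer) for bare-metal clusters. The network load balancer implementations that Kubernetes ships with are all glue code that calls out to various IaaS platforms (GCP, AWS, Azure, and so on). If you are not running on one of these supported IaaS platforms, LoadBalancers remain in the "pending" state indefinitely after they are created.

MetalLB solves the problem of the Istio ingress gateway EXTERNAL-IP staying "pending".

Installation

kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.6/manifests/namespace.yaml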
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.6/manifests/metallb.yaml
# On first install only
kubectl create secret generic -n metallb-system memberlist --from-literal=secretkey="$(openssl rand -base64 128)"

Configuration

$ kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 172.16.50.147-172.16.50.148 #Update this with your Nodes IP range
EOF

Example

Enable Istio automatic sidecar injection.

$ kubectl label namespace default istio-injection=enabled
namespace/default labeled

Deploy the Bookinfo sample.

$ cd istio-1.9.2
$ kubectl apply -f samples/bookinfo/platform/kube/bookinfo.yaml
service/details created
serviceaccount/bookinfo-details created
deployment.apps/details-v1 created
service/ratings created
serviceaccount/bookinfo-ratings created
deployment.apps/ratings-v1 created
service/reviews created
serviceaccount/bookinfo-reviews created
deployment.apps/reviews-v1 created
deployment.apps/reviews-v2 created
deployment.apps/reviews-v3 created
service/productpage created
serviceaccount/bookinfo-productpage created
deployment.apps/productpage-v1 created

Deploy the Bookinfo gateway.

$ kubectl apply -f samples/bookinfo/networking/bookinfo-gateway.yaml
gateway.networking.istio.io/bookinfo-gateway created
virtualservice.networking.istio.io/bookinfo created

Confirm that all of the pods are running with the following command.

$ kubectl get pod
NAME                              READY   STATUS    RESTARTS   AGE
details-v1-79f774bdb9-62x6b       2/2     Running   0          19m
productpage-v1-6b746f74dc-4g4hk   2/2     Running   0          19m
ratings-v1-b6994bb9-rz6pq         2/2     Running   0          19m
reviews-v1-545db77b95-bcnd8       2/2     Running   0          19m
reviews-v2-7bf8c9648f-zcgfx       2/2     Running   0          19m
reviews-v3-84779c7bbc-78bk7       2/2     Running   0          19m

Get the EXTERNAL-IP of the Istio ingress gateway.

$ kubectl get service -n istio-system
NAME                   TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)                                                                      AGE
istio-ingressgateway   LoadBalancer   10.99.204.213   172.16.50.147   15021:32373/TCP,80:30588/TCP,443:31095/TCP,15012:31281/TCP,15443:32738/TCP   73m
istiod                 ClusterIP      10.103.238.79   <none>          15010/TCP,15012/TCP,443/TCP,15014/TCP                                        75m

Open http://EXTERNAL-IP/productpage to access the productpage.
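To avoid copying the address by hand, the EXTERNAL-IP column can be read from the service table and turned into the productpage URL. The following is a minimal sketch: external_ip_of is a hypothetical helper, and the sample table repeats the output shown earlier so the example runs without a cluster.

```shell
# Hypothetical helper: read `kubectl get service` table output on stdin and
# print the EXTERNAL-IP column of the service named $1.
external_ip_of() {
  awk -v svc="$1" '$1 == svc { print $4 }'
}

# Sample table; on a real cluster, pipe
# `kubectl get service -n istio-system` into the helper instead.
ip=$(printf '%s\n' \
  'NAME                   TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)        AGE' \
  'istio-ingressgateway   LoadBalancer   10.99.204.213   172.16.50.147   80:30588/TCP   73m' \
  'istiod                 ClusterIP      10.103.238.79   <none>          443/TCP        75m' \
  | external_ip_of istio-ingressgateway)

# MetalLB has not assigned an address if the column is empty or a placeholder.
case $ip in
  ''|'<pending>'|'<none>') echo "no external IP assigned yet" ;;
  *) echo "http://$ip/productpage" ;;
esac
```

If the helper reports that no address is assigned, the MetalLB address pool configuration is the first thing to revisit.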

Written by CatchZeng
AI (Machine Learning) and DevOps enthusiast.