
Kubernetes from Beginner to Production: Building a Highly Available Cluster from Scratch (2026 Edition)



Kubernetes (K8s) has become the de facto standard for container orchestration, but there is a wide gap between learning K8s and actually running it in production. This article builds a production-grade K8s cluster from scratch, covering installation, configuration, application deployment, monitoring, and troubleshooting. Note that for simplicity the walkthrough uses a single control-plane node; true high availability requires at least three control-plane nodes behind a load balancer.


1. Kubernetes Core Concepts at a Glance

Concept           Description                                        Analogy
Pod               Smallest deployable unit; one or more containers   A process on a VM
Deployment        Declarative management and updates of Pods         An auto scaling group
Service           Stable network endpoint with load balancing        A reverse proxy
Ingress           L7 HTTP routing; domain-based entry point          An Nginx config
ConfigMap/Secret  Configuration and sensitive data management        Config files / a password book
PV/PVC            Persistent storage                                 A disk / a mount request
Namespace         Resource isolation                                 Project partitions
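To make the table concrete, here is a minimal illustrative manifest that ties several of these objects together: a Namespace, a Deployment managing Pods, and a Service exposing them. All names are placeholders, not part of the cluster built below.

```yaml
# A Namespace isolating the demo resources
apiVersion: v1
kind: Namespace
metadata:
  name: demo
---
# A Deployment declaratively managing two Pod replicas
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello
  namespace: demo
spec:
  replicas: 2
  selector:
    matchLabels:
      app: hello
  template:
    metadata:
      labels:
        app: hello
    spec:
      containers:
      - name: web
        image: nginx:alpine
        ports:
        - containerPort: 80
---
# A Service giving the Pods one stable, load-balanced endpoint
apiVersion: v1
kind: Service
metadata:
  name: hello
  namespace: demo
spec:
  selector:
    app: hello
  ports:
  - port: 80
```

Applying this with `kubectl apply -f` and deleting a Pod shows the Deployment recreating it, which is the declarative model in miniature.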

2. Environment Preparation: Planning a 3-Node Cluster

# Node planning
+----------------+------------------+--------+
| Role           | IP               | Spec   |
+----------------+------------------+--------+
| Control Plane  | 192.168.1.10     | 4C 8G  |
| Worker-1       | 192.168.1.11     | 8C 16G |
| Worker-2       | 192.168.1.12     | 8C 16G |
+----------------+------------------+--------+

# Base configuration on all nodes
cat >> /etc/hosts << EOF
192.168.1.10 k8s-master
192.168.1.11 k8s-node1
192.168.1.12 k8s-node2
EOF

# Disable the firewall and swap
systemctl stop firewalld && systemctl disable firewalld
swapoff -a && sed -i '/swap/s/^/#/' /etc/fstab

# Load the kernel modules required by containerd and the CNI
# (bridge-nf-call-iptables has no effect unless br_netfilter is loaded)
cat > /etc/modules-load.d/k8s.conf << EOF
overlay
br_netfilter
EOF
modprobe overlay
modprobe br_netfilter

# Configure kernel parameters
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
vm.swappiness                       = 0
EOF
sysctl --system

3. Container Runtime: containerd

Kubernetes 1.24 removed the dockershim, so Docker can no longer be used directly as the runtime; containerd is the recommended replacement.

# Install containerd (all nodes)
cat > /etc/yum.repos.d/docker.repo << EOF
[docker-ce-stable]
name=Docker CE Stable
baseurl=https://download.docker.com/linux/centos/9/x86_64/stable
enabled=1
gpgcheck=1
gpgkey=https://download.docker.com/linux/centos/gpg
EOF

yum install containerd.io -y

# Generate the default configuration
containerd config default > /etc/containerd/config.toml

# Key setting: use the systemd cgroup driver
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml

# Configure image mirrors. Note: appending mirror tables to config.toml breaks
# parsing (the default config already contains an empty registry.mirrors table,
# and TOML forbids duplicates). Use the recommended config_path mechanism:
sed -i 's|config_path = ""|config_path = "/etc/containerd/certs.d"|' /etc/containerd/config.toml
mkdir -p /etc/containerd/certs.d/docker.io /etc/containerd/certs.d/registry.k8s.io
cat > /etc/containerd/certs.d/docker.io/hosts.toml << 'EOF'
server = "https://registry-1.docker.io"

[host."https://docker.m.daocloud.io"]
  capabilities = ["pull", "resolve"]
EOF
cat > /etc/containerd/certs.d/registry.k8s.io/hosts.toml << 'EOF'
server = "https://registry.k8s.io"

[host."https://k8s.m.daocloud.io"]
  capabilities = ["pull", "resolve"]
EOF

systemctl enable containerd && systemctl restart containerd

4. Installing Kubernetes (with kubeadm)

4.1 Install kubeadm/kubelet/kubectl (all nodes)

# Note: the legacy packages.cloud.google.com repos were shut down in 2024;
# use the community-owned pkgs.k8s.io repos instead
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.30/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.30/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
EOF

# Install a pinned version (use a stable release)
yum install -y kubelet-1.30.0 kubeadm-1.30.0 kubectl-1.30.0 --disableexcludes=kubernetes

# Lock the versions to prevent accidental upgrades
yum install -y yum-plugin-versionlock
yum versionlock kubelet kubeadm kubectl

systemctl enable kubelet

4.2 Initialize the Cluster (on the Master)

# Configuration file: kubeadm-config.yaml
cat > kubeadm-config.yaml << EOF
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.1.10
  bindPort: 6443
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: 1.30.0
controlPlaneEndpoint: 192.168.1.10:6443  # for real HA, point this at a load-balancer VIP
networking:
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.244.0.0/16
  dnsDomain: cluster.local
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
EOF

# Initialize
kubeadm init --config=kubeadm-config.yaml --upload-certs

# On success, a join command is printed -- save it.
# It looks like:
# kubeadm join 192.168.1.10:6443 --token xxxxx --discovery-token-ca-cert-hash sha256:xxxxx

# Configure kubectl
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config

# Verify
kubectl get nodes
kubectl get pods -n kube-system

4.3 Join the Worker Nodes

# On each worker, run the join command printed by kubeadm init
kubeadm join 192.168.1.10:6443 --token xxxxx --discovery-token-ca-cert-hash sha256:xxxxx

# If the token has expired, generate a new one on the master
kubeadm token create --print-join-command

# Verify on the master
kubectl get nodes
# All nodes should be listed (they stay NotReady until the CNI is installed in the next step)
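As an alternative to the raw CLI flags, the join step can also be expressed as a kubeadm config file, which is easier to keep in version control. A sketch (the token and hash placeholders must be replaced with the values printed by `kubeadm init`):

```yaml
# join-config.yaml -- apply with: kubeadm join --config join-config.yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: JoinConfiguration
discovery:
  bootstrapToken:
    apiServerEndpoint: 192.168.1.10:6443
    token: "xxxxx"
    caCertHashes:
    - "sha256:xxxxx"
```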

5. CNI Network Plugin: Calico

# Install the Calico operator (note: release tags include the patch version)
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.28.0/manifests/tigera-operator.yaml

cat > custom-resources.yaml << EOF
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  registry: quay.io/
  calicoNetwork:
    ipPools:
    - blockSize: 26
      cidr: 10.244.0.0/16
      encapsulation: VXLANCrossSubnet
      natOutgoing: Enabled
      nodeSelector: all()
EOF

kubectl create -f custom-resources.yaml

# Verify
kubectl get pods -n calico-system -w
kubectl get nodes  # all nodes should now report Ready

6. Deploying Sample Applications

6.1 Classic Nginx + MySQL Architecture

# Create the namespace
kubectl create namespace production

# MySQL deployment
cat > mysql-deploy.yaml << EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pvc
  namespace: production
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql
  namespace: production
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:8.4
        env:
        - name: MYSQL_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-secret
              key: root-password
        ports:
        - containerPort: 3306
        volumeMounts:
        - name: mysql-storage
          mountPath: /var/lib/mysql
        resources:
          requests:
            memory: "512Mi"
            cpu: "500m"
          limits:
            memory: "2Gi"
            cpu: "2"
      volumes:
      - name: mysql-storage
        persistentVolumeClaim:
          claimName: mysql-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: mysql
  namespace: production
spec:
  selector:
    app: mysql
  ports:
  - port: 3306
    targetPort: 3306
EOF

# Create the Secret referenced by the Deployment above
kubectl create secret generic mysql-secret \
    --from-literal=root-password='K8s@2026!' \
    -n production

# Deploy
kubectl apply -f mysql-deploy.yaml
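One caveat: a fresh kubeadm cluster ships with no default StorageClass, so `mysql-pvc` will stay Pending until a provisioner or a manually created PV exists. A minimal hostPath PV sketch for test environments only (the path is illustrative; in production use a real provisioner such as Longhorn, Rook-Ceph, or your cloud's CSI driver):

```yaml
# A PV with no storageClassName binds to PVCs that also omit storageClassName
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mysql-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /data/mysql   # must exist on the node the Pod is scheduled to
```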

6.2 Nginx Web Application

cat > nginx-app.yaml << EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:alpine
        ports:
        - containerPort: 80
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "512Mi"
            cpu: "500m"
        livenessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 10
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
  namespace: production
spec:
  type: ClusterIP
  selector:
    app: nginx
  ports:
  - port: 80
    targetPort: 80
EOF

kubectl apply -f nginx-app.yaml

# Watch the Pods come up
kubectl get pods -n production -w
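Beyond a fixed replica count, the nginx Deployment above can be autoscaled with a HorizontalPodAutoscaler. A sketch, assuming metrics-server is installed (kubeadm does not install it by default):

```yaml
# Scale the nginx Deployment between 3 and 10 replicas on CPU load
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # scale out above 70% of the CPU request
```

The HPA computes utilization against the container's CPU request, which is why the Deployment sets `requests.cpu` explicitly.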

6.3 Ingress (Domain-Based Access)

# Install the NGINX Ingress Controller
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.10.0/deploy/static/provider/cloud/deploy.yaml

# Create the Ingress rule
cat > app-ingress.yaml << EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
  namespace: production
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - app.yourdomain.com
    secretName: app-tls
  rules:
  - host: app.yourdomain.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: nginx
            port:
              number: 80
EOF

kubectl apply -f app-ingress.yaml

7. Cluster Monitoring: Prometheus + Grafana

# Use kube-prometheus-stack (recommended)
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

# Create the monitoring namespace
kubectl create namespace monitoring

# Install
cat > prometheus-values.yaml << EOF
grafana:
  adminPassword: Grafana@2026!
  ingress:
    enabled: true
    hosts:
      - grafana.yourdomain.com
  persistence:
    enabled: true
    size: 10Gi
    storageClassName: standard

prometheus:
  prometheusSpec:
    retention: 30d
    storageSpec:
      volumeClaimTemplate:
        spec:
          storageClassName: standard
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 100Gi
EOF

helm install prometheus-stack prometheus-community/kube-prometheus-stack \
    --namespace monitoring \
    --values prometheus-values.yaml

# Verify
kubectl get pods -n monitoring
kubectl get svc -n monitoring

# Access Grafana
# Open http://grafana.yourdomain.com in a browser
# User: admin / Grafana@2026!
# The bundled Kubernetes cluster dashboards work out of the box
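kube-prometheus-stack discovers extra scrape targets through ServiceMonitor resources. A sketch for scraping the production app's Service, assuming the application exposes Prometheus metrics on a named Service port `metrics` (the port name and the `release` label value are assumptions; the chart's default selector matches the Helm release name used above):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: app-metrics
  namespace: monitoring
  labels:
    release: prometheus-stack   # must match the chart's serviceMonitorSelector
spec:
  namespaceSelector:
    matchNames:
    - production
  selector:
    matchLabels:
      app: nginx
  endpoints:
  - port: metrics    # named port on the Service exposing /metrics
    interval: 30s
```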

8. Day-to-Day Operations Commands

# Check cluster status
kubectl cluster-info
kubectl get nodes -o wide
kubectl get pods --all-namespaces

# Check resource usage (requires metrics-server)
kubectl top nodes
kubectl top pods --all-namespaces

# Debug a Pod
kubectl logs -f pod-name -n namespace
kubectl exec -it pod-name -n namespace -- /bin/bash
kubectl describe pod pod-name -n namespace

# Port forwarding (for debugging)
kubectl port-forward svc/nginx -n production 8080:80

# View events
kubectl get events --all-namespaces --sort-by='.lastTimestamp'

# Node maintenance
kubectl cordon node-name     # mark unschedulable
kubectl drain node-name --ignore-daemonsets  # evict Pods
kubectl uncordon node-name   # restore scheduling

# Certificate management (with cert-manager)
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.15.0/cert-manager.yaml

# Automatically issue Let's Encrypt certificates
cat > cluster-issuer.yaml << EOF
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@yourdomain.com
    privateKeySecretRef:
      name: letsencrypt-prod-key
    solvers:
    - http01:
        ingress:
          class: nginx
EOF
kubectl apply -f cluster-issuer.yaml
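With the ClusterIssuer in place, cert-manager can automatically populate the `app-tls` Secret referenced by the Ingress in section 6.3; the standard mechanism is a single annotation on that Ingress. A fragment (shown in context, not a complete manifest):

```yaml
# Add to the Ingress metadata from section 6.3
metadata:
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
# cert-manager then creates a Certificate, solves the HTTP-01 challenge
# via the nginx ingress class, and writes the signed cert into app-tls
```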

9. Common Problems and Troubleshooting

Symptom                 Cause                                  Fix
Node NotReady           CNI plugin not deployed                Confirm Calico/Flannel is running
Pod Pending             Insufficient resources / unbound PVC   Check node resources and StorageClass config
Pod CrashLoopBackOff    Application fails on startup           Inspect the app logs with kubectl logs
ImagePullBackOff        Image pull failure                     Check image name, registry auth, network
DNS resolution failing  CoreDNS unhealthy                      kubectl rollout restart -n kube-system deploy/coredns
Expired certificates    kubeadm certs are valid for 1 year     kubeadm certs renew all && systemctl restart kubelet

10. Production Security Checklist

  • RBAC: do not use the default ServiceAccount; create a dedicated SA per application
  • NetworkPolicy: deny ingress traffic by default and allow only what is needed
  • Pod security: forbid privileged containers and running as root
  • Resource limits: set a ResourceQuota and LimitRange for every Namespace
  • Image scanning: integrate Trivy to scan images for vulnerabilities automatically
  • Audit logging: enable the K8s audit log to record all API operations
  • Secrets management: use the External Secrets Operator with a cloud KMS
  • Regular upgrades: stay within the support window (each minor version is supported for roughly 14 months)
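Two of the checklist items as concrete, illustrative manifests for the production namespace: a default-deny ingress NetworkPolicy (traffic must then be re-allowed per application) and a ResourceQuota (the limits shown are arbitrary examples, not recommendations):

```yaml
# Deny all ingress traffic to every Pod in the namespace by default
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: production
spec:
  podSelector: {}      # empty selector = all Pods in the namespace
  policyTypes:
  - Ingress
---
# Cap aggregate resource consumption in the namespace
apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota
  namespace: production
spec:
  hard:
    requests.cpu: "10"
    requests.memory: 20Gi
    limits.cpu: "20"
    limits.memory: 40Gi
    persistentvolumeclaims: "10"
```

Note that NetworkPolicy only takes effect because the cluster runs a CNI that enforces it (Calico does; some CNIs silently ignore policies).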

Kubernetes has a steep learning curve, but mastering it pays off enormously in operational efficiency. Start with a kubeadm-built cluster to learn how each component works, then move on to automated production deployment with tools such as Rancher or Kubespray.

🔗 Related articles: Docker Container Security Guide | MySQL Replication and High Availability | Zabbix Monitoring Deployment

