Deploying a Kubernetes Cluster with RKE
Introduction

| Feature | kubeadm | K3s | RKE2 |
| --- | --- | --- | --- |
| Positioning | The official standard deployment tool | Minimal, lightweight, edge computing | Security, enterprise compliance, strict conformance |
| Installation | CLI tool; many parameters must be configured by hand | One-line install script; runs as a single binary | One-line install script; offline packages supported |
| Architecture | Full set of K8s components, distributed | Single binary that merges most control-plane components | Components fully consistent with upstream, no custom patches |
| Resource usage | High (roughly 1-2 GB+) | Very low (roughly 512 MB) | Moderate (roughly 1 GB) |
| Complexity | Complex | Very simple | Simple |
| Datastore | etcd (distributed) | Embedded SQLite by default; can be swapped for etcd | etcd (distributed only) |
| Security & compliance | Manual hardening required | Basic security defaults | Built-in CIS benchmark; production-ready by default |
| Typical scenarios | Learning, testing, production where flexibility matters | Edge computing, IoT, resource-constrained devices, quick prototypes | Enterprise production; high-security sectors such as finance and government |
RKE1 itself deploys quickly, but some of its images are being migrated to the rke2 namespaces. Once a namespace changes, the original image can no longer be pulled and you have to fall back to a slightly older tag. Whether that causes problems down the line is unclear, but the deployment does complete successfully.

RKE2 deployment is very slow: each node takes a long time to start, mostly because of image downloads. With direct internet access it is better, though still not fast.
Base configuration (RKE1)

Nodes: [all nodes]
```bash
cat > init.sh <<"EOF2"
if [ $# -eq 1 ];then
  echo "Setting hostname to: $1"
else
  echo "Usage: sh $0 <hostname>"
  exit 2
fi
echo "--------------------------------------"
echo "1. Setting hostname: $1"
hostnamectl set-hostname $1

echo "2. Disabling ufw"
sudo systemctl disable ufw --now

echo "3. Adding /etc/hosts entries"
add_hosts_entry() {
  local ip=$1
  local hostname=$2
  if grep -q "^$ip " /etc/hosts; then
    if ! grep -q "^$ip[[:space:]]\+$hostname" /etc/hosts; then
      sed -i "/^$ip /s/^$ip .*/$ip $hostname/" /etc/hosts
      echo "Updated: $ip -> $hostname"
    else
      echo "Already present: $ip -> $hostname"
    fi
  elif grep -q "^#$ip " /etc/hosts; then
    sed -i "/^#$ip /d" /etc/hosts
    echo "$ip $hostname" >> /etc/hosts
    echo "Uncommented and re-added: $ip -> $hostname"
  else
    echo "$ip $hostname" >> /etc/hosts
    echo "Added: $ip -> $hostname"
  fi
}
add_hosts_entry 192.168.48.10 master1
add_hosts_entry 192.168.48.20 master2
add_hosts_entry 192.168.48.201 node01
add_hosts_entry 192.168.48.202 node02
add_hosts_entry 192.168.48.128 etcd01

echo "4. Configuring chrony time sync"
[ -f /etc/debian_version ] && apt install chrony -y
CHRONY_CONF="/etc/chrony/chrony.conf"
NEW_POOL="pool ntp.aliyun.com iburst maxsources 1"
cp "$CHRONY_CONF" "$CHRONY_CONF.bak" 2>/dev/null
sed -i '/^[[:space:]]*pool/ s/^/#/; /ntp.aliyun.com/d' "$CHRONY_CONF"
echo "$NEW_POOL" >> "$CHRONY_CONF"
systemctl restart chronyd 2>/dev/null || systemctl restart chrony 2>/dev/null
timedatectl set-timezone Asia/Shanghai
timedatectl set-local-rtc 1
timedatectl set-ntp yes
sleep 3
chronyc -a makestep &>/dev/null

echo "5. Disabling swap"
sed -ri 's/^([^#].*swap.*)/#\1/' /etc/fstab
swapoff -a

echo "6. Setting kernel parameters"
cat >/etc/sysctl.d/11-rke2k8s.conf<<"EOF"
net.ipv4.ip_forward=1
net.bridge.bridge-nf-call-ip6tables=1
net.bridge.bridge-nf-call-iptables=1
EOF
modprobe br_netfilter
cat > /etc/modules-load.d/k8s.conf <<EOF
br_netfilter
EOF
sysctl --system &> /dev/null
echo "Verifying kernel parameters:"
sysctl net.ipv4.ip_forward
sysctl net.bridge.bridge-nf-call-iptables
sysctl net.bridge.bridge-nf-call-ip6tables 2>/dev/null || echo "bridge module not loaded"
EOF2
```
Make sure you run the right command on the right node:

```bash
bash init.sh master1
bash init.sh master2
bash init.sh node01
bash init.sh node02
bash init.sh etcd01
```

Then run `bash` on every node to reload the shell and pick up the new hostname; confirm the prompt shows the updated name on every host.
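The trickiest line in init.sh is the sed that comments out swap entries in /etc/fstab. You can rehearse it against a throwaway copy first (the sample entries below are made up for the demonstration):

```shell
# Run the swap-commenting sed from init.sh against a scratch file,
# not the real /etc/fstab.
tmp=$(mktemp)
cat > "$tmp" <<'SAMPLE'
UUID=1234-abcd / ext4 defaults 0 1
/swap.img none swap sw 0 0
#/old.swap none swap sw 0 0
SAMPLE
# Comment out every non-commented line that mentions swap.
sed -ri 's/^([^#].*swap.*)/#\1/' "$tmp"
cat "$tmp"
# The /swap.img line is now "#/swap.img none swap sw 0 0";
# the root filesystem line and the already-commented line are untouched.
```

Because the pattern starts with `[^#]`, re-running the script never double-comments a line, so init.sh stays safe to repeat.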
Base configuration (RKE2)

Nodes: [all nodes]
```bash
cat > init.sh <<"EOF2"
if [ $# -eq 1 ];then
  echo "Setting hostname to: $1"
else
  echo "Usage: sh $0 <hostname>"
  exit 2
fi
echo "--------------------------------------"
echo "1. Setting hostname: $1"
hostnamectl set-hostname $1

echo "2. Disabling ufw"
sudo systemctl disable ufw --now

echo "3. Adding /etc/hosts entries"
add_hosts_entry() {
  local ip=$1
  local hostname=$2
  if grep -q "^$ip " /etc/hosts; then
    if ! grep -q "^$ip[[:space:]]\+$hostname" /etc/hosts; then
      sed -i "/^$ip /s/^$ip .*/$ip $hostname/" /etc/hosts
      echo "Updated: $ip -> $hostname"
    else
      echo "Already present: $ip -> $hostname"
    fi
  elif grep -q "^#$ip " /etc/hosts; then
    sed -i "/^#$ip /d" /etc/hosts
    echo "$ip $hostname" >> /etc/hosts
    echo "Uncommented and re-added: $ip -> $hostname"
  else
    echo "$ip $hostname" >> /etc/hosts
    echo "Added: $ip -> $hostname"
  fi
}
add_hosts_entry 192.168.48.10 master1
add_hosts_entry 192.168.48.20 master2
add_hosts_entry 192.168.48.201 master3
add_hosts_entry 192.168.48.202 node01

echo "4. Configuring chrony time sync"
[ -f /etc/debian_version ] && apt install chrony -y
CHRONY_CONF="/etc/chrony/chrony.conf"
NEW_POOL="pool ntp.aliyun.com iburst maxsources 1"
cp "$CHRONY_CONF" "$CHRONY_CONF.bak" 2>/dev/null
sed -i '/^[[:space:]]*pool/ s/^/#/; /ntp.aliyun.com/d' "$CHRONY_CONF"
echo "$NEW_POOL" >> "$CHRONY_CONF"
systemctl restart chronyd 2>/dev/null || systemctl restart chrony 2>/dev/null
timedatectl set-timezone Asia/Shanghai
timedatectl set-local-rtc 1
timedatectl set-ntp yes
sleep 3
chronyc -a makestep &>/dev/null

echo "5. Disabling swap"
sed -ri 's/^([^#].*swap.*)/#\1/' /etc/fstab
swapoff -a

echo "6. Setting kernel parameters"
cat >/etc/sysctl.d/11-rke2k8s.conf<<"EOF"
net.ipv4.ip_forward=1
net.bridge.bridge-nf-call-ip6tables=1
net.bridge.bridge-nf-call-iptables=1
EOF
modprobe br_netfilter
cat > /etc/modules-load.d/k8s.conf <<EOF
br_netfilter
EOF
sysctl --system &> /dev/null
echo "Verifying kernel parameters:"
sysctl net.ipv4.ip_forward
sysctl net.bridge.bridge-nf-call-iptables
sysctl net.bridge.bridge-nf-call-ip6tables 2>/dev/null || echo "bridge module not loaded"
EOF2
```
Make sure you run the right command on the right node:

```bash
bash init.sh master1
bash init.sh master2
bash init.sh master3
bash init.sh node01
```

Then run `bash` on every node to reload the shell and pick up the new hostname; confirm the prompt shows the updated name on every host.
Passwordless SSH

Nodes: [master1]
```bash
apt install -y sshpass
cat > sshmianmi.sh << "EOF"
TARGET_USER="root"
hosts=("master1" "master2" "master3" "node01")
PASSWORD="123456"
SSH_KEY="$HOME/.ssh/id_rsa"
apt install -y sshpass
if [ ! -f "$SSH_KEY" ]; then
  ssh-keygen -t rsa -b 2048 -N "" -f "$SSH_KEY"
fi
for host in "${hosts[@]}"; do
  echo "====== Configuring ${TARGET_USER}@${host} ======"
  sshpass -p "$PASSWORD" ssh-copy-id \
    -o StrictHostKeyChecking=no \
    "${TARGET_USER}@${host}"
  ssh "${TARGET_USER}@${host}" \
    "echo '[${host}] passwordless login for ${TARGET_USER} OK'"
done
echo "🎉 Passwordless SSH configured for all hosts"
EOF
bash sshmianmi.sh
```
Install ipvsadm

Nodes: [all nodes]
```bash
sudo apt install -y ipset ipvsadm
cat << EOF | sudo tee /etc/modules-load.d/ipvs.conf
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
EOF
sudo modprobe ip_vs
sudo modprobe ip_vs_rr
sudo modprobe ip_vs_wrr
sudo modprobe ip_vs_sh
sudo modprobe nf_conntrack
lsmod | grep ip_vs
```
RKE1 deployment

RKE1 may stop receiving updates in the future; consider RKE2 instead.
| Host | IP | CPU | Memory | Disk | Role |
| --- | --- | --- | --- | --- | --- |
| master1 | 192.168.48.10 | 2 cores | 3 GB | 100 GB | controlplane, rancher, rke |
| master2 | 192.168.48.20 | 2 cores | 3 GB | 100 GB | controlplane |
| node01 | 192.168.48.201 | 2 cores | 3 GB | 100 GB | worker |
| node02 | 192.168.48.202 | 2 cores | 3 GB | 100 GB | worker |
| etcd01 | 192.168.48.128 | 2 cores | 3 GB | 100 GB | etcd |
The machines need internet access; if that is truly impossible, you will have to download the images in advance, which is covered later.

Configure the IPs yourself; that is not explained here. This walkthrough uses Ubuntu 24.04 LTS; some of the commands below may differ slightly on other systems, so adjust as needed.
Pre-flight checks (all nodes)

The earlier section Base configuration (RKE1) already handled hostname / hosts / swap / kernel parameters, and Install ipvsadm covered ipvsadm.
Install Docker

Nodes: [all nodes]

Select, copy and paste the whole block.
```bash
cat >docker-install.sh<<"EOF5"
DEFAULT_DOCKER_VERSION="29.2.1"
DOCKER_VERSION="${1:-$DEFAULT_DOCKER_VERSION}"
DOCKER_TGZ_URL="https://mirrors.aliyun.com/docker-ce/linux/static/stable/x86_64/docker-${DOCKER_VERSION}.tgz"
sysinit_info(){
  . /etc/os-release
  Dev_name=$(ip a | sed -En 's#^2: ([^ ]+):.*$#\1#p')
  local_ip=$(ip route get 1 | awk '{print $7; exit}')
  IP_PREFIX=$(ip a | grep $Dev_name | sed -En '/inet/s#.*inet ([^ ]+)/([0-9]{0,2}) .*#\2#p')
  NET_NUM=${local_ip%.*}.
  CPUS=$(nproc)
}
color(){
  local nl=$'\n'
  case $1 in
    -n) nl=""; shift ;;
  esac
  case $# in
    1) color_name=white; text=$1 ;;
    2) color_name=$(printf '%s' "$1" | tr 'A-Z' 'a-z'); text=$2 ;;
    *) echo -e "Usage: color [color] \"text\""
       echo "Colors: black red green yellow blue purple cyan"
       return 1 ;;
  esac
  case $color_name in
    black) col=30 ;;
    red) col=31 ;;
    green) col=32 ;;
    yellow) col=33 ;;
    blue) col=34 ;;
    purple) col=35 ;;
    cyan) col=36 ;;
    *) col=37 ;;
  esac
  printf '\033[%sm%s\033[0m%s' "$col" "$text" "$nl"
}
docker_uninstall(){
  color yellow "Uninstalling Docker"
  apt purge -y docker* &>/dev/null
  yum remove -y docker* &>/dev/null
  rm -rf /usr/local/bin/docker*
  rm -rf /etc/docker
  rm -rf /etc/systemd/system/docker*
  rm -rf /etc/systemd/system/containerd.service
  rm -rf /etc/systemd/system/docker.socket
  rm -rf /etc/docker/daemon.json
  color green "Uninstall complete"
}
docker_install(){
  local docker_pkg1="docker-${DOCKER_VERSION}.tgz"
  local docker_pkg2="docker.tgz"
  if [[ -f $docker_pkg1 || -f $docker_pkg2 ]]; then
    color red "Found local package(s) $( [[ -f $docker_pkg1 ]] && echo $docker_pkg1 )$( [[ -f $docker_pkg1 && -f $docker_pkg2 ]] && echo ' and ' )$( [[ -f $docker_pkg2 ]] && echo $docker_pkg2 ). Delete them? (y/n)[n]"
    read -r -p "" input
    input=${input:-n}
    if [[ $input == y || $input == Y ]]; then
      [[ -f $docker_pkg1 ]] && rm -f $docker_pkg1 && color yellow "Deleted $docker_pkg1"
      [[ -f $docker_pkg2 ]] && rm -f $docker_pkg2 && color yellow "Deleted $docker_pkg2"
    else
      color green "Keeping local packages"
    fi
  fi
  color yellow "Installing dependencies, removing old packages, then installing Docker"
  if [[ "$ID" =~ ubuntu ]]; then
    apt install -y wget bash-completion &>/dev/null || (apt update -y &>/dev/null && apt install -y wget bash-completion &>/dev/null)
    apt purge -y docker docker-client docker-client-latest docker-common \
      docker-latest docker-latest-logrotate docker-logrotate docker-selinux \
      docker-engine-selinux docker-engine docker* &>/dev/null
  elif [[ "$ID" =~ centos|rhel|rocky|openeuler ]]; then
    yum install -y wget bash-completion &>/dev/null
    yum remove -y docker docker-client docker-client-latest docker-common \
      docker-latest docker-latest-logrotate docker-logrotate docker-selinux \
      docker-engine-selinux docker-engine docker* &>/dev/null
  else
    color red "Unsupported OS: $ID"
    exit 1
  fi
  if [ -f docker-${DOCKER_VERSION}.tgz ] || [ -f docker.tgz ]; then
    color green "Using local docker-${DOCKER_VERSION}.tgz or docker.tgz"
    mv docker-${DOCKER_VERSION}.tgz docker.tgz &>/dev/null
  else
    color yellow "No local package, downloading"
    wget $DOCKER_TGZ_URL -O docker.tgz
    [ ! -f docker.tgz ] && color red "Download failed" && exit 1
  fi
  if ! tar -xvf docker.tgz -C /usr/local/bin --strip-components 1; then
    color red "Extraction failed, re-downloading"
    rm -f docker.tgz
    wget -O docker.tgz "$DOCKER_TGZ_URL" || { color red "Re-download failed"; exit 1; }
    tar -xvf docker.tgz -C /usr/local/bin --strip-components 1 || { color red "Extraction failed again"; exit 1; }
  fi
  cat > /etc/systemd/system/docker.service <<EOF
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service
Wants=network-online.target

[Service]
Type=notify
ExecStart=/usr/local/bin/dockerd --config-file=/etc/docker/daemon.json
ExecReload=/bin/kill -s HUP \$MAINPID
TimeoutSec=0
RestartSec=2
Restart=always
StartLimitBurst=3
StartLimitInterval=60s
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
Delegate=yes
KillMode=process
OOMScoreAdjust=-500

[Install]
WantedBy=multi-user.target
EOF
  getent group docker || groupadd docker
  mkdir -p /etc/docker
  cat > /etc/docker/daemon.json <<'EOF'
{
  "registry-mirrors": [
    "https://docker.xuanyuan.me",
    "https://docker.m.daocloud.io",
    "https://docker.1ms.run",
    "https://run-docker.cn",
    "https://docker.sunzishaokao.com",
    "https://docker.1panel.live",
    "https://registry.cn-hangzhou.aliyuncs.com",
    "https://docker.qianyios.top",
    "https://registry.aliyuncs.com",
    "https://ghcr.nju.edu.cn",
    "https://k8s.nju.edu.cn"
  ],
  "hosts": ["unix:///var/run/docker.sock"],
  "max-concurrent-downloads": 10,
  "log-driver": "json-file",
  "log-level": "warn",
  "log-opts": { "max-size": "10m", "max-file": "3" },
  "data-root": "/var/lib/docker",
  "live-restore": true
}
EOF
  systemctl daemon-reload
  systemctl enable --now docker.service
  systemctl restart docker
  if [ $? -eq 0 ]; then
    color green "Docker ${DOCKER_VERSION} installed successfully"
  else
    color red "Install failed"
    exit 1
  fi
}
menu(){
  color "================================ Docker install script ================================"
  color blue "Author: 严千屹"
  color green "Aliyun mirror: https://mirrors.aliyun.com/docker-ce/linux/static/stable/x86_64"
  color -n blue "Current default version is ${DEFAULT_DOCKER_VERSION}; to change it, edit "; color red "DEFAULT_DOCKER_VERSION"
  color blue "A local docker.tgz is used when present; otherwise the package is downloaded"
  color "======================================================================================"
  color blue "1. Install Docker"
  color blue "2. Uninstall Docker"
  color "============================================================================"
  read -p "Enter a number: " choice
  case $choice in
    1) docker_install ;;
    2) docker_uninstall ;;
    *) color red "Invalid choice, please try again" ;;
  esac
}
main(){
  sysinit_info
  if [ $# -ge 1 ] && [ -n "$1" ]; then
    docker_install
    color green "To uninstall, run: bash docker-install.sh and use the menu"
    exit 0
  else
    menu
  fi
}
main "$@"
EOF5
```
Run on all nodes.

You can pass a specific version, but first check which versions exist at https://mirrors.aliyun.com/docker-ce/linux/static/stable/x86_64/

Note for the RKE1 path: the latest RKE1 release is currently 1.8.10, which requires Docker client API 1.41; the matching Docker release is docker-20.10.9. The script defaults to 29.2.1, whose API version is 1.53, and with that RKE1 cannot deploy. So install the matching version explicitly:

```bash
bash docker-install.sh 20.10.9
```
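The version mismatch above is easy to check mechanically. `api_ok` below is a small helper of my own (not part of the install script) that compares dotted version strings with `sort -V`, the way you would sanity-check RKE1's required API 1.41 against what a client offers:

```shell
# Hypothetical helper: succeed when the required version ($1) is
# less than or equal to the available version ($2).
api_ok() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$1" ]
}
api_ok 1.41 1.41 && echo "requirement 1.41 met by 1.41"
api_ok 1.41 1.53 && echo "requirement 1.41 met by 1.53"
api_ok 1.41 1.24 || echo "1.24 is too old for a 1.41 requirement"
```

`sort -V` handles multi-digit components correctly (1.9 < 1.41), which a plain string comparison would get wrong.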
Install docker-compose

Nodes: [all nodes]
```bash
cat > docker-compose-install.sh <<"EOF5"
function_list(){
sysinit_info(){
  . /etc/os-release
  Dev_name=$(ip a | sed -En 's#^2: ([^ ]+):.*$#\1#p')
  local_ip=$(ip route get 1 | awk '{print $7; exit}')
  IP_PREFIX=$(ip a | grep $Dev_name | sed -En '/inet/s#.*inet ([^ ]+)/([0-9]{0,2}) .*#\2#p')
  NET_NUM=${local_ip%.*}.
  local_passwd=123456
  CPUS=$(nproc)
}
sysinit_info
color(){
  local nl=$'\n'
  case $1 in
    -n) nl=""; shift ;;
  esac
  case $# in
    1) text=$1; color_name=white ;;
    2) color_name=$(printf '%s' "$1" | tr 'A-Z' 'a-z'); text=$2 ;;
    *) echo -e "Usage: color [color] \"text\""
       echo "Colors: black, red, green, yellow, blue, purple, cyan"
       return 1 ;;
  esac
  case $color_name in
    black) col=30 ;;
    red) col=31 ;;
    green) col=32 ;;
    yellow) col=33 ;;
    blue) col=34 ;;
    purple) col=35 ;;
    cyan) col=36 ;;
    *) col=37 ;;
  esac
  printf '\033[%sm%s\033[0m%s' "$col" "$text" "$nl"
}
install_docker_compose(){
  os_name="$(uname -s)"
  arch_name="$(uname -m)"
  url="https://github.com/docker/compose/releases/latest/download/docker-compose-${os_name}-${arch_name}"
  URL1="https://ghproxy.com/${url}"
  URL2="https://ghfast.top/${url}"
  TARGET="/usr/local/bin/docker-compose"
  is_valid_compose_binary(){
    local file_path="$1"
    [ -f "$file_path" ] || return 1
    if LC_ALL=C head -c 256 "$file_path" 2>/dev/null | tr -d '\000' | grep -qiE '<!doctype html|<html'; then
      return 1
    fi
    chmod +x "$file_path" 2>/dev/null
    "$file_path" --version >/dev/null 2>&1
    return $?
  }
  if is_valid_compose_binary /usr/local/bin/docker-compose; then
    color green "docker-compose is already installed and working"
    ln -sf /usr/local/bin/docker-compose /usr/bin/docker-compose &> /dev/null
    exit 0
  elif is_valid_compose_binary /usr/bin/docker-compose; then
    color green "docker-compose is already installed and working"
    ln -sf /usr/bin/docker-compose /usr/local/bin/docker-compose &> /dev/null
    exit 0
  else
    rm -f /usr/local/bin/docker-compose /usr/bin/docker-compose
  fi
  tmp_file="$(mktemp /tmp/docker-compose.XXXXXX)"
  download_ok=0
  for u in "$URL1" "$URL2" "$url"; do
    rm -f "$tmp_file"
    color blue "Trying: $u"
    if curl -fL --connect-timeout 10 --max-time 120 --retry 2 --retry-delay 2 "$u" -o "$tmp_file"; then
      if is_valid_compose_binary "$tmp_file"; then
        mv -f "$tmp_file" "$TARGET"
        download_ok=1
        break
      else
        color yellow "Downloaded content is not an executable docker-compose (likely an HTML page), trying next source"
      fi
    fi
  done
  if [ $download_ok -ne 1 ]; then
    rm -f "$tmp_file" "$TARGET"
    color red "docker-compose download failed: no source was usable or all returned non-binary content"
    exit 1
  fi
  chmod +x "$TARGET"
  ln -sf "$TARGET" /usr/bin/docker-compose
  /usr/local/bin/docker-compose --version
  if [ $? -eq 0 ]; then
    color green "docker-compose installed successfully"
    color yellow "To uninstall, run: bash $0"
  else
    color red "docker-compose install failed"
  fi
}
uninstall_docker_compose(){
  rm -f /usr/local/bin/docker-compose
  rm -f /usr/bin/docker-compose
  docker-compose --version &> /dev/null
  if [ $? -eq 0 ];then
    color red "docker-compose uninstall failed"
  else
    color green "docker-compose uninstalled"
  fi
}
menu(){
  color "======================== docker-compose install script ========================="
  color blue "Author: 严千屹"
  echo ""
  color green "Installs the latest docker-compose automatically (no version input needed)"
  color blue "Choose an action:"
  color green "1. Install docker-compose"
  color green "2. Uninstall docker-compose"
  color "============================================================================"
  read -p "Enter a number: " num
  case $num in
    1) install_docker_compose ;;
    2) uninstall_docker_compose ;;
    *) color red "Please enter a valid number" ;;
  esac
}
}
main(){
  function_list
  if [ "$1" = "latest" ]; then
    install_docker_compose
    exit 0
  fi
  menu
}
main "$@"
EOF5
```
Run on all nodes:

```bash
bash docker-compose-install.sh latest
```
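The installer's main failure mode is a mirror answering with an HTML error page instead of the binary, which is why the script sniffs the file content before trusting it. A minimal reproduction of that check, run against two throwaway sample files:

```shell
# Reject "binaries" that are actually HTML pages, as the installer does.
is_html() { LC_ALL=C head -c 256 "$1" | grep -qiE '<!doctype html|<html'; }

good=$(mktemp); printf 'not-html-binary-ish-content\n' > "$good"
bad=$(mktemp);  printf '<!DOCTYPE html><html>mirror error page</html>' > "$bad"

is_html "$good" || echo "good file: treated as a binary"
is_html "$bad"  && echo "bad file: HTML detected, try the next mirror"
```

The real script additionally runs `docker-compose --version` on the candidate file, so even a non-HTML but corrupted download is rejected.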
Add a Rancher user

Nodes: [all nodes]

✅ In production, for security and least-privilege reasons, it is not advisable to operate Docker directly as root, so we create a dedicated user for Docker-related operations.

```bash
useradd -d /home/rancher -m -s /bin/bash rancher || true
usermod -aG docker rancher
echo "rancher:123456" | chpasswd
```
Passwordless SSH from the RKE node

Nodes: [master1]

As the host topology shows, RKE runs on master1, so you only need to generate an SSH key on master1 and distribute it to the other nodes.

```bash
apt install -y sshpass
cat > sshmianmi.sh << "EOF"
TARGET_USER="rancher"
hosts=("master1" "master2" "node01" "node02" "etcd01")
PASSWORD="123456"
SSH_KEY="$HOME/.ssh/id_rsa"
apt install -y sshpass
if [ ! -f "$SSH_KEY" ]; then
  ssh-keygen -t rsa -b 2048 -N "" -f "$SSH_KEY"
fi
for host in "${hosts[@]}"; do
  echo "====== Configuring ${TARGET_USER}@${host} ======"
  sshpass -p "$PASSWORD" ssh-copy-id \
    -o StrictHostKeyChecking=no \
    "${TARGET_USER}@${host}"
  ssh "${TARGET_USER}@${host}" \
    "echo '[${host}] passwordless login for ${TARGET_USER} OK'"
done
echo "🎉 Passwordless SSH configured for all hosts"
EOF
bash sshmianmi.sh
```
Download the RKE1 binary

https://github.com/rancher/rke/releases

```bash
#!/bin/bash
rm -f /usr/local/bin/rke
RKE_URL="https://github.com/rancher/rke/releases/download/v1.8.10/rke_linux-amd64"
LOCAL_PATH="/usr/local/bin/rke"
URL1="https://ghproxy.com/${RKE_URL}"
URL2="https://ghfast.top/${RKE_URL}"
URLS=("$URL1" "$URL2" "$RKE_URL")
success=false
for url in "${URLS[@]}"; do
  echo "Trying: $url"
  curl -fsSL -o "$LOCAL_PATH" "$url"
  if file "$LOCAL_PATH" | grep -q "ELF"; then
    echo "Downloaded from: $url"
    success=true
    break
  else
    echo "Downloaded file is not a binary, trying the next URL..."
    rm -f "$LOCAL_PATH"
  fi
done
if [ "$success" = false ]; then
  echo "All download URLs failed; check your network or the URL"
  exit 1
fi
chmod +x "$LOCAL_PATH"
rke --version
```
Generate the cluster config with RKE1

```bash
RKE_DIR="/etc/rancher/rke"
mkdir -p $RKE_DIR
cd $RKE_DIR
rke config --name cluster.yml
```

To be clear: for now we only bring up a minimal cluster — master1, node01 and etcd01, three nodes in total. Later we demonstrate how to join the remaining nodes.
```
root@master1:/etc/rancher/rke# rke config --name cluster.yml
[+] Cluster Level SSH Private Key Path [~/.ssh/id_rsa]:
[+] Number of Hosts [1]: 3
[+] SSH Address of host (1) [none]: 192.168.48.10
[+] SSH Port of host (1) [22]:
[+] SSH Private Key Path of host (192.168.48.10) [none]: ~/.ssh/id_rsa
[+] SSH User of host (192.168.48.10) [ubuntu]: rancher
[+] Is host (192.168.48.10) a Control Plane host (y/n)? [y]: y
[+] Is host (192.168.48.10) a Worker host (y/n)? [n]: n
[+] Is host (192.168.48.10) an etcd host (y/n)? [n]: n
[+] Override Hostname of host (192.168.48.10) [none]:
[+] Internal IP of host (192.168.48.10) [none]:
[+] Docker socket path on host (192.168.48.10) [/var/run/docker.sock]:
[+] SSH Address of host (2) [none]: 192.168.48.201
[+] SSH Port of host (2) [22]:
[+] SSH Private Key Path of host (192.168.48.201) [none]: ~/.ssh/id_rsa
[+] SSH User of host (192.168.48.201) [ubuntu]: rancher
[+] Is host (192.168.48.201) a Control Plane host (y/n)? [y]: n
[+] Is host (192.168.48.201) a Worker host (y/n)? [n]: y
[+] Is host (192.168.48.201) an etcd host (y/n)? [n]: n
[+] Override Hostname of host (192.168.48.201) [none]:
[+] Internal IP of host (192.168.48.201) [none]:
[+] Docker socket path on host (192.168.48.201) [/var/run/docker.sock]:
[+] SSH Address of host (3) [none]: 192.168.48.128
[+] SSH Port of host (3) [22]:
[+] SSH Private Key Path of host (192.168.48.128) [none]: ~/.ssh/id_rsa
[+] SSH User of host (192.168.48.128) [ubuntu]: rancher
[+] Is host (192.168.48.128) a Control Plane host (y/n)? [y]: n
[+] Is host (192.168.48.128) a Worker host (y/n)? [n]: n
[+] Is host (192.168.48.128) an etcd host (y/n)? [n]: y
[+] Override Hostname of host (192.168.48.128) [none]:
[+] Internal IP of host (192.168.48.128) [none]:
[+] Docker socket path on host (192.168.48.128) [/var/run/docker.sock]:
[+] Network Plugin Type (flannel, calico, weave, canal, aci) [canal]: calico
[+] Authentication Strategy [x509]:
[+] Authorization Mode (rbac, none) [rbac]:
[+] Kubernetes Docker image [rke-extended-life/hyperkube:v1.32.11-rancher1]:
[+] Cluster domain [cluster.local]:
[+] Service Cluster IP Range [10.43.0.0/16]:
[+] Cluster Network CIDR [10.42.0.0/16]:
[+] Cluster DNS Service IP [10.43.0.10]:
[+] Add addon manifest URLs or YAML files [no]:
```
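For reference, the answers above correspond roughly to this cluster.yml skeleton. This is a hand-written sketch trimmed to the fields the wizard asked about; the file actually generated by `rke config` contains many more defaulted sections:

```yaml
nodes:
  - address: 192.168.48.10
    user: rancher
    role: [controlplane]
    ssh_key_path: ~/.ssh/id_rsa
  - address: 192.168.48.201
    user: rancher
    role: [worker]
    ssh_key_path: ~/.ssh/id_rsa
  - address: 192.168.48.128
    user: rancher
    role: [etcd]
    ssh_key_path: ~/.ssh/id_rsa
network:
  plugin: calico
authentication:
  strategy: x509
authorization:
  mode: rbac
```

Knowing this shape matters later: adding a node is nothing more than appending another entry under `nodes:` and re-running `rke up`.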
For better performance and security there is some extra configuration worth doing, such as enabling IPVS.

1. Configure IPVS

If you have not installed ipvsadm yet, go back to the Install ipvsadm section (optional).
Deploy the cluster

```bash
RKE_DIR="/etc/rancher/rke"
rke up --config $RKE_DIR/cluster.yml
```
Images have to be pulled during deployment. Without direct internet access some may still succeed through mirrors, but for anything that fails you have to look up the image names in the generated config and fetch them one by one from a mirror site (for example the 渡渡鸟 mirror, or by configuring the 轩辕 mirror yourself), then re-tag them to the expected names; even then, some registries may only be reachable through a proxy/TUN setup.

That is exactly what happened here with the default image rke-extended-life/hyperkube:v1.32.11-rancher1: presumably because the images are being migrated toward rke2, it could not be pulled, and on Docker Hub there is no rke-extended-life/hyperkube at all — only rancher/hyperkube:v1.32.6-rancher1.

So update the image reference in the config and re-run the install:

```bash
RKE_DIR="/etc/rancher/rke"
sed -i 's#rke-extended-life/hyperkube:v1.32.11-rancher1#rancher/hyperkube:v1.32.6-rancher1#g' $RKE_DIR/cluster.yml
rke up --config $RKE_DIR/cluster.yml
```
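If you want to confirm what the substitution does before touching the real config, the same sed can be rehearsed on a scratch file:

```shell
# Rehearse the image swap on a throwaway file first.
f=$(mktemp)
echo '    image: rke-extended-life/hyperkube:v1.32.11-rancher1' > "$f"
sed -i 's#rke-extended-life/hyperkube:v1.32.11-rancher1#rancher/hyperkube:v1.32.6-rancher1#g' "$f"
cat "$f"   # → "    image: rancher/hyperkube:v1.32.6-rancher1"
```

Using `#` as the sed delimiter avoids having to escape the `/` characters inside the image paths.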
After a while the deployment succeeds. What we are still missing is the kubectl client.
Install kubectl

https://cjyabraham.gitlab.io/docs/tasks/tools/install-kubectl/#install-kubectl-binary-using-native-package-management

```bash
curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl
chmod +x ./kubectl
sudo mv ./kubectl /usr/local/bin/kubectl
```

```bash
apt install bash-completion -y
echo 'source /usr/share/bash-completion/bash_completion' >> ~/.bashrc
echo 'source <(kubectl completion bash)' >> ~/.bashrc
source ~/.bashrc
```
At this point kubectl has no credentials and cannot reach the cluster.

Add cluster credentials

```bash
RKE_DIR="/etc/rancher/rke"
cd $RKE_DIR
```

`rke up` generated a kubeconfig next to the cluster config:

```
root@master1:/etc/rancher/rke# ll
..... cluster.rkestate
..... cluster.yml
..... kube_config_cluster.yml
```
Copy that file into .kube under the home directory:

```bash
cat > kube-config.sh <<"EOF"
RKE_DIR="/etc/rancher/rke"
mkdir -p /root/.kube
cp $RKE_DIR/kube_config_cluster.yml /root/.kube/config
chmod 600 /root/.kube/config
EOF
bash kube-config.sh
```
Now try again:

```
root@master1:/etc/rancher/rke# kubectl get nodes
NAME             STATUS   ROLES          AGE     VERSION
192.168.48.10    Ready    controlplane   8m29s   v1.32.6
192.168.48.128   Ready    etcd           8m28s   v1.32.6
192.168.48.201   Ready    worker         8m27s   v1.32.6
```
Adding nodes (the same procedure for every node type)

Prerequisite: the new node must have been through the initial setup, in particular the steps in "Passwordless SSH from the RKE node". We already did this for all hosts above; if you are adding a brand-new node, be sure to repeat the base system configuration — especially the SSH key distribution.
Edit the config file and add the new node entry:

```bash
RKE_DIR="/etc/rancher/rke"
vim $RKE_DIR/cluster.yml
```

Then apply:

```bash
RKE_DIR="/etc/rancher/rke"
rke up --config $RKE_DIR/cluster.yml
```
RKE now redeploys:

```
INFO[0000] [dialer] Setup tunnel for host [192.168.48.201]
INFO[0000] [dialer] Setup tunnel for host [192.168.48.128]
INFO[0000] [dialer] Setup tunnel for host [192.168.48.20]
INFO[0000] [dialer] Setup tunnel for host [192.168.48.10]
```

And the node list is updated — very convenient:

```
root@master1:/etc/rancher/rke# kubectl get nodes
NAME             STATUS   ROLES          AGE    VERSION
192.168.48.10    Ready    controlplane   20m    v1.32.6
192.168.48.128   Ready    etcd           20m    v1.32.6
192.168.48.20    Ready    controlplane   107s   v1.32.6
192.168.48.201   Ready    worker         20m    v1.32.6
```
For high availability, adding etcd nodes works exactly the same way, but a minimal HA setup needs at least 3 etcd nodes (an odd number of three or more) — that is an iron rule. It is not demonstrated here.

To remove a node, simply delete the entry you added from the config file.

And remember: after any change, re-run `rke up`.
Uninstall the RKE1-deployed cluster

```bash
RKE_DIR="/etc/rancher/rke"
rke remove --config $RKE_DIR/cluster.yml
```
RKE2 deployment (online)

https://documentation.suse.com/cloudnative/rke2/latest/zh/install/methods.html

RKE2 uses its internal containerd by default. Whether online or offline, each node can take a very long time to start; offline is somewhat faster.
Host topology

Minimal deployment (single master)

| Host | IP | Role |
| --- | --- | --- |
| master1 | 192.168.48.10 | server (controlplane + etcd) |
| node01 | 192.168.48.202 | agent (worker) |

Generally, for a single-master setup, master1 + node01 is enough to get started.
High-availability deployment (recommended for production)

| Host | IP | Role |
| --- | --- | --- |
| master1 | 192.168.48.10 | server (controlplane + etcd) |
| master2 | 192.168.48.20 | server (controlplane + etcd) |
| master3 | 192.168.48.201 | server (controlplane + etcd) |
| node01 | 192.168.48.202 | agent (worker) |
For high availability you need at least 3 master (server) nodes to guarantee etcd quorum.

In RKE2, etcd is an internal component (static Pods); unlike RKE1, you do not need dedicated machines for an external etcd cluster.
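The 3-server minimum comes from etcd's quorum arithmetic: a cluster of n members needs floor(n/2)+1 of them alive to make progress, so it tolerates floor((n-1)/2) failures. A quick table:

```shell
# Quorum size and tolerated failures for common etcd cluster sizes.
for n in 1 2 3 5; do
  echo "members=$n quorum=$((n / 2 + 1)) tolerated_failures=$(((n - 1) / 2))"
done
# members=1 quorum=1 tolerated_failures=0
# members=2 quorum=2 tolerated_failures=0
# members=3 quorum=2 tolerated_failures=1
# members=5 quorum=3 tolerated_failures=2
```

Note that a 2-member cluster tolerates zero failures, which is why even member counts buy you nothing: 3 is the smallest size that survives losing a node.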
Pre-flight checks (all nodes)

The earlier section Base configuration (RKE2) already handled hostname / hosts / swap / kernel parameters, and Install ipvsadm covered ipvsadm.
(Optional) Configure registry mirrors / private registries

Do this step first: RKE2's first startup will then pull images through the mirrors directly, with a much higher success rate.

If your internet access is unreliable, add registries.yaml on all nodes first:
```bash
sudo mkdir -p /etc/rancher/rke2
sudo tee /etc/rancher/rke2/registries.yaml > /dev/null <<'EOF'
mirrors:
  docker.io:
    endpoint:
      - "https://registry.cn-hangzhou.aliyuncs.com/"
      - "https://docker.xuanyuan.me"
      - "https://docker.m.daocloud.io"
      - "https://docker.1ms.run"
      - "https://docker.1panel.live"
      - "https://hub.rat.dev"
      - "https://docker-mirror.aigc2d.com"
      - "https://docker.qianyios.top/"
  quay.io:
    endpoint:
      - "https://quay.tencentcloudcr.com/"
  registry.k8s.io:
    endpoint:
      - "https://registry.aliyuncs.com/v2/google_containers"
  gcr.io:
    endpoint:
      - "https://gcr.m.daocloud.io/"
  k8s.gcr.io:
    endpoint:
      - "https://registry.aliyuncs.com/google_containers"
  ghcr.io:
    endpoint:
      - "https://ghcr.m.daocloud.io/"
EOF
```
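The same file can also carry private-registry credentials through a `configs` section. A sketch based on the RKE2 registries.yaml schema — `registry.example.com` and the credentials below are placeholders, not values from this deployment:

```yaml
configs:
  "registry.example.com":
    auth:
      username: myuser        # placeholder
      password: mypassword    # placeholder
    tls:
      insecure_skip_verify: false
```

After editing registries.yaml, restart the rke2-server (or rke2-agent) service for containerd to pick up the change.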
Install the RKE2 server on the first control-plane node (master1)

Nodes: master1

First create the config directory and file:

```bash
mkdir -p /etc/rancher/rke2
cat > /etc/rancher/rke2/config.yaml <<'EOF'
token: "123456"
system-default-registry: registry.cn-hangzhou.aliyuncs.com
tls-san:
  - "192.168.48.10"
cni: "calico"
kube-proxy-arg:
  - "proxy-mode=ipvs"
  - "ipvs-strict-arp=true"
EOF
```
Install and start:

```bash
curl -sfL https://rancher-mirror.rancher.cn/rke2/install.sh | INSTALL_RKE2_MIRROR=cn sh -
systemctl enable rke2-server --now
```

If the China mirror is unavailable, fall back to the official address:

```bash
curl -sfL https://get.rke2.io | sh -
```

Check startup status:

```bash
systemctl status rke2-server --no-pager
journalctl -u rke2-server -f
```

After the first initialization, you can read the join token (we already set it in config.yaml — this just shows where it lives):

```bash
cat /var/lib/rancher/rke2/server/node-token
```
Configure the kubectl, crictl and ctr tools

Nodes: [all master nodes]

Wait until rke2-server has started successfully on every master before running the commands below: some of the files they rely on are only generated after a successful start, and running them earlier will produce errors.

The kubeconfig stored at /etc/rancher/rke2/rke2.yaml configures access to the Kubernetes cluster, and tools such as containerd, containerd-shim-runc-v2, crictl, ctr, kubectl, kubelet and runc are placed in /var/lib/rancher/rke2/bin.
```bash
mkdir -p ~/.kube
cp /etc/rancher/rke2/rke2.yaml ~/.kube/config
chmod 600 ~/.kube/config

cat > /etc/crictl.yaml <<'EOF'
runtime-endpoint: unix:///run/k3s/containerd/containerd.sock
image-endpoint: unix:///run/k3s/containerd/containerd.sock
timeout: 10
debug: false
EOF

grep -q 'export CONTAINERD_ADDRESS=/run/k3s/containerd/containerd.sock' ~/.bashrc || echo 'export CONTAINERD_ADDRESS=/run/k3s/containerd/containerd.sock' >> ~/.bashrc
grep -q 'export KUBECONFIG=/etc/rancher/rke2/rke2.yaml' ~/.bashrc || echo 'export KUBECONFIG=/etc/rancher/rke2/rke2.yaml' >> ~/.bashrc
grep -q 'export PATH=\$PATH:/var/lib/rancher/rke2/bin' ~/.bashrc || echo 'export PATH=$PATH:/var/lib/rancher/rke2/bin' >> ~/.bashrc
source ~/.bashrc

apt install bash-completion -y
grep -q 'source /usr/share/bash-completion/bash_completion' ~/.bashrc || echo 'source /usr/share/bash-completion/bash_completion' >> ~/.bashrc
grep -q 'source <(kubectl completion bash)' ~/.bashrc || echo 'source <(kubectl completion bash)' >> ~/.bashrc
source ~/.bashrc
```
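The `grep ... || echo ... >>` pattern above makes each append idempotent: however many times you re-run the block, no line is duplicated in ~/.bashrc. Demonstrated against a scratch file standing in for ~/.bashrc:

```shell
# Append-once helper, shown on a throwaway file.
rc=$(mktemp)
add_once() { grep -qxF "$1" "$rc" || echo "$1" >> "$rc"; }

add_once 'export KUBECONFIG=/etc/rancher/rke2/rke2.yaml'
add_once 'export KUBECONFIG=/etc/rancher/rke2/rke2.yaml'  # second call is a no-op
wc -l < "$rc"   # → 1
```

Here `-x` matches the whole line and `-F` treats the pattern as a fixed string, so shell metacharacters in the exported values cannot break the check.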
Verify:

```bash
kubectl get pod -A
crictl images ls
ctr -n k8s.io images ls
```
(Optional) Add the other two control-plane nodes (master2, master3)

Nodes: master2, master3

An additional server node must point at an existing server via `server: https://<IP>:9345`.

```bash
mkdir -p /etc/rancher/rke2
cat > /etc/rancher/rke2/config.yaml <<'EOF'
server: https://192.168.48.10:9345
token: "123456"
system-default-registry: registry.cn-hangzhou.aliyuncs.com
cni: "calico"
kube-proxy-arg:
  - "proxy-mode=ipvs"
  - "ipvs-strict-arp=true"
EOF

curl -sfL https://get.rke2.io | sh -
systemctl enable rke2-server --now
```

In China, prefer this instead:

```bash
curl -sfL https://rancher-mirror.rancher.cn/rke2/install.sh | INSTALL_RKE2_MIRROR=cn sh -
systemctl enable rke2-server --now
```
Go back to master1 and check: you should see all three server nodes gradually become Ready.
Add a worker node (node01)

Nodes: node01

```bash
mkdir -p /etc/rancher/rke2
cat > /etc/rancher/rke2/config.yaml <<'EOF'
server: https://192.168.48.10:9345
token: "123456"
system-default-registry: registry.cn-hangzhou.aliyuncs.com
kube-proxy-arg:
  - "proxy-mode=ipvs"
  - "ipvs-strict-arp=true"
EOF

curl -sfL https://rancher-mirror.rancher.cn/rke2/install.sh | INSTALL_RKE2_MIRROR=cn INSTALL_RKE2_TYPE="agent" sh -
systemctl enable rke2-agent --now
```

If the China mirror is unavailable, fall back to the official address:

```bash
curl -sfL https://get.rke2.io | INSTALL_RKE2_TYPE="agent" sh -
```
Verify on master1:

```bash
kubectl get nodes -o wide
```

Uninstall RKE2 (if needed)

```bash
rke2-uninstall.sh        # on server nodes
rke2-agent-uninstall.sh  # on agent nodes
```
RKE2 deployment (offline)

https://docs.rancher.cn/docs/rke2/install/airgap/

RKE2 uses its internal containerd by default. Whether online or offline, each node can take a very long time to start.
Host topology

Minimal deployment (single master)

| Host | IP | Role |
| --- | --- | --- |
| master1 | 192.168.48.10 | server (controlplane + etcd) |
| node01 | 192.168.48.202 | agent (worker) |

Generally, for a single-master setup, master1 + node01 is enough to get started.
High-availability deployment (recommended for production)

| Host | IP | Role |
| --- | --- | --- |
| master1 | 192.168.48.10 | server (controlplane + etcd) |
| master2 | 192.168.48.20 | server (controlplane + etcd) |
| master3 | 192.168.48.201 | server (controlplane + etcd) |
| node01 | 192.168.48.202 | agent (worker) |
For high availability you need at least 3 master (server) nodes to guarantee etcd quorum.

In RKE2, etcd is an internal component (static Pods); unlike RKE1, you do not need dedicated machines for an external etcd cluster.
Pre-flight checks (all nodes)

The earlier section Base configuration (RKE2) already handled hostname / hosts / swap / kernel parameters, and Install ipvsadm covered ipvsadm.
(Optional) Configure registry mirrors / private registries

Do this step first: RKE2's first startup will then pull images through the mirrors directly, with a much higher success rate.

If your internet access is unreliable, add registries.yaml on all RKE2 nodes first:
```bash
sudo mkdir -p /etc/rancher/rke2
sudo tee /etc/rancher/rke2/registries.yaml > /dev/null <<'EOF'
mirrors:
  docker.io:
    endpoint:
      - "https://registry.cn-hangzhou.aliyuncs.com/"
      - "https://docker.xuanyuan.me"
      - "https://docker.m.daocloud.io"
      - "https://docker.1ms.run"
      - "https://docker.1panel.live"
      - "https://hub.rat.dev"
      - "https://docker-mirror.aigc2d.com"
      - "https://docker.qianyios.top/"
  quay.io:
    endpoint:
      - "https://quay.tencentcloudcr.com/"
  registry.k8s.io:
    endpoint:
      - "https://registry.aliyuncs.com/v2/google_containers"
  gcr.io:
    endpoint:
      - "https://gcr.m.daocloud.io/"
  k8s.gcr.io:
    endpoint:
      - "https://registry.aliyuncs.com/google_containers"
  ghcr.io:
    endpoint:
      - "https://ghcr.m.daocloud.io/"
EOF
```
Prepare the offline artifacts

https://github.com/rancher/rke2/releases

Note: put the offline packages under /root/rke2-artifacts and keep that path on every node — using one consistent path saves editing commands over and over later.

Nodes: master1

Select, paste and run the whole block. I will pick calico later; if you use a different CNI, remember to change the network plugin in the later config files accordingly.
```bash
cat > rke2-source-download.sh <<"EOF2"
set -e
RKE2_URL_VERSION="v1.35.1+rke2r1"
TARGET_DIR="/root/rke2-artifacts"
PROXY_URLS=("https://ghfast.top/" "https://ghproxy.com/")
BASE_URL="https://github.com/rancher/rke2/releases/download/${RKE2_URL_VERSION}"
INSTALL_SCRIPT_URL="https://get.rke2.io"
ARCH="amd64"

color() {
  local color_name=$1
  local text=$2
  case $color_name in
    red) echo -e "\033[31m${text}\033[0m" ;;
    green) echo -e "\033[32m${text}\033[0m" ;;
    yellow) echo -e "\033[33m${text}\033[0m" ;;
    blue) echo -e "\033[34m${text}\033[0m" ;;
    *) echo "${text}" ;;
  esac
}

format_bytes() {
  local bytes="$1"
  local units=("B" "KB" "MB" "GB" "TB")
  local idx=0
  while [ "${bytes}" -ge 1024 ] && [ ${idx} -lt 4 ]; do
    bytes=$((bytes / 1024))
    idx=$((idx + 1))
  done
  echo "${bytes}${units[$idx]}"
}

download_with_pretty_progress() {
  local url="$1"
  local output_file="$2"
  local tmp_file="${output_file}.part"
  local total_size current_size percent filled empty
  local bar_width=30
  local bar
  total_size=$(curl -fsIL "${url}" | awk 'BEGIN{IGNORECASE=1} /^content-length:/ {gsub("\r", "", $2); print $2; exit}')
  rm -f "${tmp_file}"
  curl -fL --silent --show-error -o "${tmp_file}" "${url}" &
  local curl_pid=$!
  if [[ -n "${total_size}" && "${total_size}" =~ ^[0-9]+$ && "${total_size}" -gt 0 ]]; then
    while kill -0 "${curl_pid}" 2>/dev/null; do
      if [ -f "${tmp_file}" ]; then
        current_size=$(stat -c%s "${tmp_file}" 2>/dev/null || echo 0)
      else
        current_size=0
      fi
      percent=$((current_size * 100 / total_size))
      [ "${percent}" -gt 100 ] && percent=100
      filled=$((percent * bar_width / 100))
      empty=$((bar_width - filled))
      bar="$(printf '%*s' "${filled}" '' | tr ' ' '█')$(printf '%*s' "${empty}" '' | tr ' ' '░')"
      printf "\r[%s] %3d%% (%s/%s)" \
        "${bar}" "${percent}" "$(format_bytes "${current_size}")" "$(format_bytes "${total_size}")"
      sleep 0.2
    done
    wait "${curl_pid}"
    local curl_rc=$?
    if [ ${curl_rc} -ne 0 ]; then
      printf "\r\033[K"
      rm -f "${tmp_file}"
      return 1
    fi
    printf "\r[%s] 100%% (%s/%s)\n" \
      "$(printf '%*s' "${bar_width}" '' | tr ' ' '█')" "$(format_bytes "${total_size}")" "$(format_bytes "${total_size}")"
  else
    color "yellow" "Could not determine total size; falling back to plain download..."
    wait "${curl_pid}"
    local curl_rc=$?
    if [ ${curl_rc} -ne 0 ]; then
      rm -f "${tmp_file}"
      return 1
    fi
  fi
  mv -f "${tmp_file}" "${output_file}"
  return 0
}

if [ -d "${TARGET_DIR}" ] && [ -n "$(ls -A "${TARGET_DIR}" 2>/dev/null)" ]; then
  echo ""
  color "yellow" "Target directory is not empty: ${TARGET_DIR}"
  read -p "Clear it and continue? (y/n) [n]: " clear_target_choice
  clear_target_choice=${clear_target_choice:-n}
  if [[ "${clear_target_choice}" == "y" || "${clear_target_choice}" == "Y" ]]; then
    find "${TARGET_DIR}" -mindepth 1 -maxdepth 1 -exec rm -rf {} +
    color "green" "✓ Cleared target directory: ${TARGET_DIR}"
  else
    color "blue" "Keeping existing files and continuing"
  fi
fi
mkdir -p "${TARGET_DIR}"
cd "${TARGET_DIR}"

download_with_fallback() {
  local file="$1"
  local origin_url="${BASE_URL}/${file}"
  color "blue" "About to download: ${file}"
  for proxy in "${PROXY_URLS[@]}" ""; do
    local url="${proxy}${origin_url}"
    echo "Trying: ${url}"
    if download_with_pretty_progress "${url}" "${file}"; then
      if [ -s "${file}" ]; then
        if ! file "${file}" | grep -q "HTML"; then
          color "green" "✓ ${file} downloaded"
          return 0
        else
          rm -f "${file}"
        fi
      fi
    fi
  done
  color "red" "✗ ${file} download failed"
  return 1
}

echo ""
color "yellow" "=== Select system architecture ==="
echo "1) amd64 (x86_64, default)"
echo "2) arm64 (ARM)"
read -p "Choice [1-2] (default: 1): " arch_choice
arch_choice=${arch_choice:-1}
if [ "${arch_choice}" == "2" ]; then
  ARCH="arm64"
fi
color "blue" "Architecture: ${ARCH}"

echo ""
color "yellow" "=== Select the CNI plugin to use ==="
echo "Per the release list, the options are:"
echo "1) calico  - feature-rich (recommended)"
echo "2) canal   - Flannel + Calico policy"
echo "3) cilium  - eBPF-based"
echo "4) flannel - simple overlay network"
echo "5) multus  - multiple network interfaces"
echo "6) core package only (includes the default canal)"
echo "7) download everything (all CNIs)"
read -p "Choice [1-7] (default: 1): " cni_choice
cni_choice=${cni_choice:-1}
CNI_PACKAGES=()
case $cni_choice in
  1)
    color "blue" "Selected: calico"
    CNI_PACKAGES+=("rke2-images-calico.linux-${ARCH}.tar.zst")
    ;;
  2)
    color "blue" "Selected: canal"
    CNI_PACKAGES+=("rke2-images-canal.linux-${ARCH}.tar.zst")
    ;;
  3)
    color "blue" "Selected: cilium"
    CNI_PACKAGES+=("rke2-images-cilium.linux-${ARCH}.tar.zst")
```
;; 4) color "blue" "选择: flannel" CNI_PACKAGES+=("rke2-images-flannel.linux-${ARCH} .tar.zst" ) ;; 5) color "blue" "选择: multus" CNI_PACKAGES+=("rke2-images-multus.linux-${ARCH} .tar.zst" ) ;; 6) color "blue" "选择: 只下载核心包(默认使用 canal)" CNI_PACKAGES=() ;; 7) color "blue" "选择: 全部下载" CNI_PACKAGES+=( "rke2-images-calico.linux-${ARCH} .tar.zst" "rke2-images-canal.linux-${ARCH} .tar.zst" "rke2-images-cilium.linux-${ARCH} .tar.zst" "rke2-images-flannel.linux-${ARCH} .tar.zst" "rke2-images-multus.linux-${ARCH} .tar.zst" ) ;; *) color "red" "无效选项" exit 1 ;; esac echo "" color "yellow" "=== 可选其他组件 ===" echo "release 列表中还包含以下可选组件:" echo "- vsphere : VMware vSphere 集成" echo "- harvester : Harvester HCI 集成" echo "- traefik : Ingress Controller(默认已包含在核心包中)" read -p "是否需要下载 vSphere 镜像?(y/n) [n]: " vsphere_choiceread -p "是否需要下载 Harvester 镜像?(y/n) [n]: " harvester_choiceread -p "是否需要单独下载 Traefik 镜像?(核心包已包含)(y/n) [n]: " traefik_choiceecho "" color "yellow" "=== 选择下载源 ===" echo "1) 加速地址(默认,失败后自动回退官方地址)" echo "2) 官方地址(仅 GitHub 官方)" read -p "请输入选项 [1-2] (默认: 1): " source_choicesource_choice=${source_choice:-1} case $source_choice in 1) color "blue" "选择: 加速地址 + 官方地址兜底" ;; 2) PROXY_URLS=() color "blue" "选择: 仅使用官方地址(适用外网)" ;; *) color "red" "无效选项" exit 1 ;; esac color "yellow" "\n=== 构建下载文件列表 ===" FILES_TO_DOWNLOAD=( "rke2.linux-${ARCH} .tar.gz" "sha256sum-${ARCH} .txt" ) FILES_TO_DOWNLOAD+=("rke2-images.linux-${ARCH} .tar.zst" ) for pkg in "${CNI_PACKAGES[@]} " ; do FILES_TO_DOWNLOAD+=("${pkg} " ) done if [[ "${vsphere_choice} " == "y" || "${vsphere_choice} " == "Y" ]]; then FILES_TO_DOWNLOAD+=("rke2-images-vsphere.linux-${ARCH} .tar.zst" ) fi if [[ "${harvester_choice} " == "y" || "${harvester_choice} " == "Y" ]]; then FILES_TO_DOWNLOAD+=("rke2-images-harvester.linux-${ARCH} .tar.zst" ) fi if [[ "${traefik_choice} " == "y" || "${traefik_choice} " == "Y" ]]; then FILES_TO_DOWNLOAD+=("rke2-images-traefik.linux-${ARCH} .tar.zst" ) fi unique_files=($(printf "%s\n" "${FILES_TO_DOWNLOAD[@]} " | 
sort -u)) color "yellow" "\n=== 即将下载以下文件(共 ${#unique_files[@]} 个)===" for file in "${unique_files[@]} " ; do echo " - ${file} " done echo "" read -p "确认开始下载?(y/n) [y]: " confirmconfirm=${confirm:-y} if [[ "${confirm} " != "y" && "${confirm} " != "Y" ]]; then color "red" "下载已取消" exit 0 fi color "yellow" "\n=== 开始下载 ===" FAILED_FILES=() MAX_RETRY_TIMES=2 RETRY_WAIT_SECONDS=5 for file in "${unique_files[@]} " ; do download_with_fallback "${file} " || { if [[ "${file} " == "rke2-images.linux-${ARCH} .tar.zst" ]]; then color "yellow" "完整包下载失败,尝试下载核心包..." download_with_fallback "rke2-images-core.linux-${ARCH} .tar.zst" || { color "yellow" "⚠ ${file} 下载失败,已加入重试列表" FAILED_FILES+=("${file} " ) } else color "yellow" "⚠ ${file} 下载失败,已加入重试列表" FAILED_FILES+=("${file} " ) fi } done if [ ${#FAILED_FILES[@]} -gt 0 ]; then color "yellow" "\n=== 初次下载失败文件(共 ${#FAILED_FILES[@]} 个)===" for file in "${FAILED_FILES[@]} " ; do echo " - ${file} " done for ((retry=1 ; retry<=MAX_RETRY_TIMES; retry++)); do if [ ${#FAILED_FILES[@]} -eq 0 ]; then break fi color "yellow" "\n=== 第 ${retry} /${MAX_RETRY_TIMES} 次重试(等待 ${RETRY_WAIT_SECONDS} s)===" sleep "${RETRY_WAIT_SECONDS} " RETRY_FAILED_FILES=() for file in "${FAILED_FILES[@]} " ; do download_with_fallback "${file} " || { if [[ "${file} " == "rke2-images.linux-${ARCH} .tar.zst" ]]; then color "yellow" "完整包下载失败,尝试下载核心包..." 
download_with_fallback "rke2-images-core.linux-${ARCH} .tar.zst" || { RETRY_FAILED_FILES+=("${file} " ) } else RETRY_FAILED_FILES+=("${file} " ) fi } done FAILED_FILES=("${RETRY_FAILED_FILES[@]} " ) done if [ ${#FAILED_FILES[@]} -gt 0 ]; then color "yellow" "\n⚠ 重试后仍下载失败的文件(共 ${#FAILED_FILES[@]} 个):" for file in "${FAILED_FILES[@]} " ; do echo " - ${file} " done else color "green" "\n✓ 所有初次失败文件已在重试阶段下载成功" fi fi color "yellow" "\n=== 下载安装脚本 ===" if download_with_pretty_progress "${INSTALL_SCRIPT_URL} " "install.sh" ; then chmod +x install.sh color "green" "✓ install.sh 下载成功" else color "red" "❌ install.sh 下载失败" exit 1 fi color "yellow" "\n=== 校验下载文件 ===" if [ -f "sha256sum-${ARCH} .txt" ]; then sha256sum -c "sha256sum-${ARCH} .txt" --ignore-missing || { color "yellow" "⚠ 部分文件校验失败,但不影响使用" } fi color "green" "\n✅ 离线文件准备完成!保存在: ${TARGET_DIR} " echo "===========================================" ls -lh "${TARGET_DIR} " | awk '{print $9 " (" $5 ")"}' echo "===========================================" color "yellow" "\n📋 后续步骤:" echo "1. 将镜像包复制到所有节点的 /var/lib/rancher/rke2/agent/images/" echo "2. 在各节点执行安装脚本: INSTALL_RKE2_ARTIFACT_PATH=${TARGET_DIR} sh ${TARGET_DIR} /install.sh" if [ ${#CNI_PACKAGES[@]} -gt 0 ]; then case $cni_choice in 1) echo "3. 在 config.yaml 中配置: cni: calico" ;; 2) echo "3. 在 config.yaml 中配置: cni: canal" ;; 3) echo "3. 在 config.yaml 中配置: cni: cilium" ;; 4) echo "3. 在 config.yaml 中配置: cni: flannel" ;; 5) echo "3. 在 config.yaml 中配置: cni: multus" ;; 7) echo "3. 在 config.yaml 中根据需求配置 cni 选项" ;; esac else echo "3. 在 config.yaml 中可以不配置 cni(默认使用 canal)" fi EOF2 bash rke2-source-download.sh
Distribute the offline packages to all nodes
Prerequisite: the nodes can reach each other over SSH (passwordless login was configured earlier).
Run on: master1
for host in master2 master3 node01; do
  ssh root@${host} "mkdir -p /root/rke2-artifacts"
  scp -r /root/rke2-artifacts/* root@${host}:/root/rke2-artifacts/
done
ssh root@master2 "ls -lh /root/rke2-artifacts"
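After copying, it is worth spot-checking that the archives arrived intact. A minimal sketch, assuming the same hostnames and the amd64 tarball name used above:

```shell
# Compare the local sha256 digest of the main tarball with each remote copy.
FILE=/root/rke2-artifacts/rke2.linux-amd64.tar.gz
local_sum=$(sha256sum "$FILE" | awk '{print $1}')
for host in master2 master3 node01; do
  remote_sum=$(ssh root@${host} "sha256sum $FILE" | awk '{print $1}')
  if [ "$remote_sum" = "$local_sum" ]; then
    echo "${host}: OK"
  else
    echo "${host}: MISMATCH"
  fi
done
```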
Place the offline image packages on all nodes
Run on: all nodes
mkdir -p /var/lib/rancher/rke2/agent/images
cp /root/rke2-artifacts/rke2-images*.linux-amd64.tar.zst /var/lib/rancher/rke2/agent/images/
ls -lh /var/lib/rancher/rke2/agent/images/
If you only downloaded split packages such as core + calico, that is fine too — just place them in the same directory.
Install the RKE2 server on the first control-plane node (master1)
Run on: master1
First create the configuration directory and file:
mkdir -p /etc/rancher/rke2
cat > /etc/rancher/rke2/config.yaml <<'EOF'
token: "123456"
tls-san:
  - "192.168.48.10"
cni: "calico"
kube-proxy-arg:
  - "proxy-mode=ipvs"
  - "ipvs-strict-arp=true"
EOF
Install and start:
INSTALL_RKE2_ARTIFACT_PATH=/root/rke2-artifacts sh /root/rke2-artifacts/install.sh && \
systemctl enable rke2-server.service --now
journalctl -u rke2-server -f
After the first initialization, fetch the join token:
cat /var/lib/rancher/rke2/server/node-token
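For reference, this file contains the full server token (typically of the form K10&lt;ca-hash&gt;::server:&lt;token&gt;); on joining nodes, either this full string or the short `token` value set in config.yaml works. An illustrative snippet — the token value here is hypothetical:

```yaml
# /etc/rancher/rke2/config.yaml on a joining node -- values are illustrative
server: https://192.168.48.10:9345
token: "K10abcdef0123::server:123456"   # full node-token; hypothetical value
```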
Configure the kubectl, crictl and ctr tools
Run on: [all master nodes]
Wait until rke2-server has started successfully on all master nodes before running the commands below — some of the files referenced here are only generated after a successful start, and running the commands earlier will fail.
The kubeconfig file stored at /etc/rancher/rke2/rke2.yaml configures access to the Kubernetes cluster, and several tools (containerd, containerd-shim-runc-v2, crictl, ctr, kubectl, kubelet, runc) are placed in /var/lib/rancher/rke2/bin.
mkdir -p ~/.kube
cp /etc/rancher/rke2/rke2.yaml ~/.kube/config
chmod 600 ~/.kube/config
cat > /etc/crictl.yaml <<'EOF'
runtime-endpoint: unix:///run/k3s/containerd/containerd.sock
image-endpoint: unix:///run/k3s/containerd/containerd.sock
timeout: 10
debug: false
EOF
grep -q 'export CONTAINERD_ADDRESS=/run/k3s/containerd/containerd.sock' ~/.bashrc || echo 'export CONTAINERD_ADDRESS=/run/k3s/containerd/containerd.sock' >> ~/.bashrc
grep -q 'export KUBECONFIG=/etc/rancher/rke2/rke2.yaml' ~/.bashrc || echo 'export KUBECONFIG=/etc/rancher/rke2/rke2.yaml' >> ~/.bashrc
grep -q 'export PATH=\$PATH:/var/lib/rancher/rke2/bin' ~/.bashrc || echo 'export PATH=$PATH:/var/lib/rancher/rke2/bin' >> ~/.bashrc
source ~/.bashrc
apt install bash-completion -y
grep -q 'source /usr/share/bash-completion/bash_completion' ~/.bashrc || echo 'source /usr/share/bash-completion/bash_completion' >> ~/.bashrc
grep -q 'source <(kubectl completion bash)' ~/.bashrc || echo 'source <(kubectl completion bash)' >> ~/.bashrc
source ~/.bashrc
Verify:
kubectl get pod -A
crictl images
ctr -n k8s.io images ls
(Optional) Add two more control-plane nodes (master2, master3)
Run on: master2, master3
mkdir -p /etc/rancher/rke2
cat > /etc/rancher/rke2/config.yaml <<'EOF'
server: https://192.168.48.10:9345
token: "123456"
cni: "calico"
kube-proxy-arg:
  - "proxy-mode=ipvs"
  - "ipvs-strict-arp=true"
EOF
INSTALL_RKE2_ARTIFACT_PATH=/root/rke2-artifacts sh /root/rke2-artifacts/install.sh
systemctl enable rke2-server --now
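Before joining, it can save debugging time to confirm that the supervisor port (9345) on master1 is reachable from the new node. A quick bash-only check (no nc required) — a sketch, assuming the master1 address used above:

```shell
# /dev/tcp is a bash feature: the redirection succeeds only if the TCP connect succeeds.
if timeout 3 bash -c '</dev/tcp/192.168.48.10/9345'; then
  echo "supervisor port reachable"
else
  echo "supervisor port unreachable"
fi
```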
Then check back on master1 that the new control-plane nodes have joined (e.g. with kubectl get nodes).
Add a worker node (node01)
Run on: node01
mkdir -p /etc/rancher/rke2
cat > /etc/rancher/rke2/config.yaml <<'EOF'
server: https://192.168.48.10:9345
token: "123456"
kube-proxy-arg:
  - "proxy-mode=ipvs"
  - "ipvs-strict-arp=true"
EOF
INSTALL_RKE2_ARTIFACT_PATH=/root/rke2-artifacts \
INSTALL_RKE2_TYPE="agent" sh /root/rke2-artifacts/install.sh && \
systemctl enable rke2-agent --now
journalctl -u rke2-agent -f
Verify on master1:
kubectl get nodes -o wide
Worker nodes can keep the crictl and ctr tooling; kubectl is not needed there.
cat > /etc/crictl.yaml <<'EOF'
runtime-endpoint: unix:///run/k3s/containerd/containerd.sock
image-endpoint: unix:///run/k3s/containerd/containerd.sock
timeout: 10
debug: false
EOF
grep -q 'export CONTAINERD_ADDRESS=/run/k3s/containerd/containerd.sock' ~/.bashrc || echo 'export CONTAINERD_ADDRESS=/run/k3s/containerd/containerd.sock' >> ~/.bashrc
grep -q 'export PATH=\$PATH:/var/lib/rancher/rke2/bin' ~/.bashrc || echo 'export PATH=$PATH:/var/lib/rancher/rke2/bin' >> ~/.bashrc
source ~/.bashrc
Verify:
crictl images
ctr -n k8s.io images ls
Uninstall RKE2 (if needed)
rke2-uninstall.sh        # on server nodes
rke2-agent-uninstall.sh  # on agent nodes
RKE issues
1. RKE2 deployment issues
First, RKE2 deployment is extremely slow. I have not pinned down the cause — most likely the embedded containerd is slow at pulling images — and it stayed slow even with a registry mirror configured and even over a proxy. If the first start fails, simply starting it again usually works.
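For context, the registry mirror mentioned above is configured for RKE2's embedded containerd via /etc/rancher/rke2/registries.yaml (restart rke2-server afterwards). The mirror endpoint below is a placeholder, not a recommendation:

```yaml
# /etc/rancher/rke2/registries.yaml -- endpoint is a hypothetical mirror address
mirrors:
  docker.io:
    endpoint:
      - "https://mirror.example.com"
```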
Second, the network plugin only kicks in after every node you declared has joined. My config specifies calico, so only after all nodes have joined — that is, after each node has finished downloading its images — do the calico pods start, and the nodes only turn Ready once calico is up.
root@master1:~# kubectl get nodes
NAME      STATUS     ROLES                AGE   VERSION
master1   NotReady   control-plane,etcd   24m   v1.35.1+rke2r1
master2   NotReady   control-plane,etcd   21m   v1.35.1+rke2r1
master3   NotReady   control-plane,etcd   21m   v1.35.1+rke2r1
node01    NotReady   <none>               23m   v1.35.1+rke2r1
root@master1:~# kubectl get pod -A
NAMESPACE       NAME                                       READY   STATUS     RESTARTS   AGE
calico-system   calico-kube-controllers-546b4578b8-x5t89   0/1     Pending    0          4m57s
calico-system   calico-node-288mf                          0/1     Init:1/3   0          4m57s
calico-system   calico-node-82tnk                          0/1     Init:1/3   0          4m57s
calico-system   calico-node-8g8wv                          0/1     Init:1/3   0          4m57s
calico-system   calico-node-ht8bf                          0/1     Init:1/3   0          4m57s
calico-system   calico-typha-6654d87cc4-tmzpw              0/1     Pending    0          4m48s
calico-system   calico-typha-6654d87cc4-ww6xw              0/1     Pending    0          4m57s
Image downloads also take a long time. You can check manually which images are still needed and pull them on each node.
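One way to see which images are still needed, sketched with kubectl: list the images referenced by pods that are not yet Running, de-duplicated; each of them can then be pre-pulled on the nodes with crictl.

```shell
# Print every image referenced by non-Running pods, one per line, de-duplicated.
kubectl get pods -A --field-selector=status.phase!=Running \
  -o jsonpath='{range .items[*]}{range .spec.containers[*]}{.image}{"\n"}{end}{end}' |
  sort -u
# On each node, the listed images can then be pre-pulled, e.g.:
#   crictl pull <image>
```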
root@master1:~/rke2-artifacts# kubectl get nodes
NAME      STATUS   ROLES                AGE   VERSION
master1   Ready    control-plane,etcd   63m   v1.35.1+rke2r1
master2   Ready    control-plane,etcd   59m   v1.35.1+rke2r1
master3   Ready    control-plane,etcd   59m   v1.35.1+rke2r1
node01    Ready    <none>               62m   v1.35.1+rke2r1
Rancher web UI
Run on: [master1]
This is a test environment only — the Docker-based install is fine for testing, but not recommended for production. For details, see this article:
https://blog.qianyios.top/posts/e650e0d0
Official docs: https://ranchermanager.docs.rancher.com/zh/getting-started/installation-and-upgrade/other-installation-methods/rancher-on-a-single-node-with-docker
docker rm -f qianyios_rancher &> /dev/null
docker stop qianyios_rancher &> /dev/null && docker rm qianyios_rancher &> /dev/null
rm -rf /data/rancher_home
mkdir -p /data/rancher_home/rancher
mkdir -p /data/rancher_home/auditlog
mkdir -p /data/rancher_home/rancher/k3s/agent/images/
docker run --rm --entrypoint "" -v $(pwd):/output \
  docker.cnb.cool/qianyios/qyrepo/rancher-rancher:latest \
  cp /var/lib/rancher/k3s/agent/images/k3s-airgap-images.tar /output/
mv k3s-airgap-images.tar /data/rancher_home/rancher/k3s/agent/images/
docker run -d --privileged --restart=unless-stopped \
  -p 80:80 -p 443:443 \
  -e TZ=Asia/Shanghai \
  -e CATTLE_SYSTEM_CATALOG=bundled \
  -v /data/rancher_home/rancher:/var/lib/rancher \
  -v /data/rancher_home/auditlog:/var/log/auditlog \
  --name qianyios_rancher docker.cnb.cool/qianyios/qyrepo/rancher-rancher:latest
docker logs -f qianyios_rancher
Then open the page in a browser. Since Rancher runs on master1 and the service takes a while to start, the page may not be reachable immediately; wait a few minutes until it loads.
https://192.168.48.10
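Instead of refreshing by hand, you can poll until Rancher answers over HTTPS (-k because of the self-signed certificate). A small sketch; wait_for_url is a helper defined here, not a Rancher command:

```shell
# Retry until the URL answers or the attempt budget runs out (default 60 tries, 5 s apart).
wait_for_url() {
  url=$1; tries=${2:-60}
  i=0
  while [ "$i" -lt "$tries" ]; do
    curl -ksf -o /dev/null "$url" && return 0
    i=$((i + 1))
    sleep 5
  done
  return 1
}
# Usage:
#   wait_for_url https://192.168.48.10 && echo "Rancher is up"
```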
Get the bootstrap password and enter it on the page:
docker logs qianyios_rancher 2>&1 | grep "Bootstrap Password:"
You will then be asked to set a new password — choose your own.
Now we add the cluster.
At this point the RKE2-deployed cluster gets a new pod (cattle-cluster-agent) that has to pull its image, which can be very slow:
POD_NAME=$(kubectl get pod -n cattle-system | grep cattle-cluster-agent | awk '{print $1}')
kubectl describe pod -n cattle-system $POD_NAME
At this point, see this article of mine:
https://blog.qianyios.top/posts/69efb119/
Sync the image. My version matches the one above; if yours is a different version, you will need to sync it yourself.
docker pull docker.cnb.cool/qianyios/qyrepo/rancher-rancher-agent:v2.13.2
docker tag docker.cnb.cool/qianyios/qyrepo/rancher-rancher-agent:v2.13.2 rancher/rancher-agent:v2.13.2
After that it starts normally.
All articles on the Qianyi blog are knowledge notes carefully compiled from my classes and self-study, so mistakes are inevitable. If you spot a slip, let me know in the comments below or by private message. Many thanks for everyone's support!