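PVE host fails to boot because it depends on a VM for DHCP
While tinkering with "Using MACVLAN to give the host a static IP and a dynamic IP at the same time", I rebooted and found that the host was unreachable again 😓
I booted into Rescue Mode and saw the system log stuck at the Job networking.service/start running stage, the same symptom as in these two threads (though the cause turned out to be different).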
- [SOLVED] - Job networking.service/start running (12min / no limit) | Proxmox Support Forum
- [SOLVED] PVE Host stuck on “Job networking.service/start running” | Proxmox Support Forum
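Running systemctl disable systemd-networkd-wait-online would avoid the hang at this point, but without a network I still can't reach the host, so that is pointless.
The log also contained messages like apparmor="DENIED" operation="create" class="net" info="failed protocol match" error=-13 profile="/{,usr/}sbin/dhclient" pid=968 comm="dhclient" family="unix" sock_type="dgram" protocol=0 requested="create" denied="create" addr=none, which kept misleading me into thinking the problem was AppArmor's permission settings for dhclient. Searching online suggested it was related to an upgrade or to NTP, but I had touched neither, so that could not be the cause.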
- Proxmox 8.0 Hang on Boot - A Fix
- Proxmox 9.0.4 doesn’t get IP by DHCP | Proxmox Support Forum
- networking - Why does apparmor kill dhclient? - Ask Ubuntu
Booting a Live CD, connecting, and chrooting in to read the detailed logs made things much clearer: the network device vmbr1 never got an IP from DHCP.
11月 18 18:14:14 pve NetworkManager[840]: <info> [1763460854.4146] ifupdown: management mode: unmanaged
11月 18 18:14:14 pve NetworkManager[840]: <info> [1763460854.4146] ifupdown: interface-parser: parsing file /etc/network/interfaces
11月 18 18:14:14 pve NetworkManager[840]: <info> [1763460854.4148] ifupdown: interface-parser: source line includes interfaces file(s) /etc/network/interfaces.d/*
11月 18 18:14:14 pve NetworkManager[840]: <info> [1763460854.4148] ifupdown: interface-parser: parsing file /etc/network/interfaces.d/mgmt
11月 18 18:14:14 pve NetworkManager[840]: <info> [1763460854.4149] ifupdown: interface-parser: finished parsing file /etc/network/interfaces.d/mgmt
11月 18 18:14:14 pve NetworkManager[840]: <info> [1763460854.4150] ifupdown: interface-parser: parsing file /etc/network/interfaces.d/mgmt~
11月 18 18:14:14 pve NetworkManager[840]: <info> [1763460854.4150] ifupdown: interface-parser: finished parsing file /etc/network/interfaces.d/mgmt~
11月 18 18:14:14 pve NetworkManager[840]: <info> [1763460854.4151] ifupdown: interface-parser: parsing file /etc/network/interfaces.d/sdn
11月 18 18:14:14 pve NetworkManager[840]: <info> [1763460854.4151] ifupdown: interface-parser: finished parsing file /etc/network/interfaces.d/sdn
11月 18 18:14:14 pve NetworkManager[840]: <info> [1763460854.4151] ifupdown: interface-parser: finished parsing file /etc/network/interfaces
11月 18 18:14:14 pve systemd[1]: Starting NetworkManager-dispatcher.service - Network Manager Script Dispatcher Service...
11月 18 18:14:14 pve NetworkManager[840]: <info> [1763460854.4167] ifupdown: guessed connection type (enp5s0) = 802-3-ethernet
11月 18 18:14:14 pve NetworkManager[840]: <info> [1763460854.4175] ifupdown: guessed connection type (vmbr0) = 802-3-ethernet
11月 18 18:14:14 pve NetworkManager[840]: <info> [1763460854.4181] ifupdown: guessed connection type (vmbr0) = 802-3-ethernet
11月 18 18:14:14 pve NetworkManager[840]: <info> [1763460854.4182] ifupdown: guessed connection type (vmbr1) = 802-3-ethernet
11月 18 18:14:14 pve NetworkManager[840]: <info> [1763460854.4182] ifupdown: guessed connection type (vmbr1) = 802-3-ethernet
11月 18 18:14:14 pve NetworkManager[840]: <info> [1763460854.4182] ifupdown: guessed connection type (mgmt) = 802-3-ethernet
11月 18 18:14:14 pve NetworkManager[840]: <info> [1763460854.4182] ifupdown: guessed connection type (mgmt) = 802-3-ethernet
11月 18 18:14:14 pve NetworkManager[840]: <info> [1763460854.4201] dhcp: init: Using DHCP client 'internal'
---8<---
11月 18 18:14:14 pve networking[891]: networking: Configuring network interfaces
11月 18 18:14:15 pve NetworkManager[840]: <info> [1763460855.2847] manager: (vmbr0): new Bridge device (/org/freedesktop/NetworkManager/Devices/3)
11月 18 18:14:15 pve kernel: vmbr0: port 1(enp5s0) entered blocking state
11月 18 18:14:15 pve kernel: vmbr0: port 1(enp5s0) entered disabled state
11月 18 18:14:15 pve kernel: alx 0000:05:00.0 enp5s0: entered allmulticast mode
11月 18 18:14:15 pve kernel: alx 0000:05:00.0 enp5s0: entered promiscuous mode
11月 18 18:14:15 pve kernel: alx 0000:05:00.0 enp5s0: NIC Up: 1 Gbps Full
11月 18 18:14:15 pve NetworkManager[840]: <info> [1763460855.2940] device (enp5s0): carrier: link connected
11月 18 18:14:15 pve kernel: vmbr0: port 1(enp5s0) entered blocking state
11月 18 18:14:15 pve kernel: vmbr0: port 1(enp5s0) entered forwarding state
11月 18 18:14:15 pve NetworkManager[840]: <info> [1763460855.2983] device (vmbr0): carrier: link connected
11月 18 18:14:15 pve NetworkManager[840]: <info> [1763460855.3090] manager: (vmbr1): new Bridge device (/org/freedesktop/NetworkManager/Devices/4)
11月 18 18:14:15 pve info[912]: vmbr1: enabling syslog for dhcp configuration
11月 18 18:14:15 pve info[912]: executing ip -o addr show vmbr1
11月 18 18:14:16 pve info[912]: executing /sbin/dhclient -pf /run/dhclient.vmbr1.pid -lf /var/lib/dhcp/dhclient.vmbr1.leases vmbr1
11月 18 18:14:16 pve kernel: kauditd_printk_skb: 112 callbacks suppressed
11月 18 18:14:16 pve kernel: audit: type=1400 audit(1763460856.317:123): apparmor="DENIED" operation="create" class="net" info="failed protocol match" error=-13 profile="/{,usr/}sbin/dhclient" pid=968 comm="dhclient" family="unix" sock_type="dgram" protocol=0 requested="create" denied="create" addr=none
11月 18 18:14:16 pve kernel: audit: type=1400 audit(1763460856.324:124): apparmor="DENIED" operation="create" class="net" info="failed protocol match" error=-13 profile="/{,usr/}sbin/dhclient" pid=969 comm="dhclient" family="unix" sock_type="stream" protocol=0 requested="create" denied="create" addr=none
11月 18 18:14:16 pve kernel: audit: type=1400 audit(1763460856.324:125): apparmor="DENIED" operation="create" class="net" info="failed protocol match" error=-13 profile="/{,usr/}sbin/dhclient" pid=969 comm="dhclient" family="unix" sock_type="stream" protocol=0 requested="create" denied="create" addr=none
11月 18 18:14:16 pve kernel: audit: type=1400 audit(1763460856.353:126): apparmor="DENIED" operation="capable" class="cap" profile="/{,usr/}sbin/dhclient" pid=969 comm="dhclient" capability=21 capname="sys_admin"
11月 18 18:14:16 pve kernel: audit: type=1400 audit(1763460856.353:127): apparmor="DENIED" operation="create" class="net" info="failed protocol match" error=-13 profile="/{,usr/}sbin/dhclient" pid=969 comm="dhclient" family="unix" sock_type="dgram" protocol=0 requested="create" denied="create" addr=none
11月 18 18:14:19 pve NetworkManager[840]: <info> [1763460859.0191] failed to open /run/network/ifstate
11月 18 18:14:19 pve (plymouth)[832]: rescue.service: Unable to locate executable 'plymouth': No such file or directory
11月 18 18:14:23 pve kernel: audit: type=1400 audit(1763460863.519:128): apparmor="DENIED" operation="create" class="net" info="failed protocol match" error=-13 profile="/{,usr/}sbin/dhclient" pid=969 comm="dhclient" family="unix" sock_type="dgram" protocol=0 requested="create" denied="create" addr=none
11月 18 18:14:24 pve systemd[1]: NetworkManager-dispatcher.service: Deactivated successfully.
11月 18 18:14:31 pve kernel: audit: type=1400 audit(1763460871.361:129): apparmor="DENIED" operation="create" class="net" info="failed protocol match" error=-13 profile="/{,usr/}sbin/dhclient" pid=969 comm="dhclient" family="unix" sock_type="dgram" protocol=0 requested="create" denied="create" addr=none
11月 18 18:14:39 pve systemd[1]: Reload requested from client PID 976 ('systemd-sulogin') (unit rescue.service)...
11月 18 18:14:39 pve systemd[1]: Reloading...
---8<---
11月 18 18:15:17 pve info[912]: executing ip -o addr show vmbr1
11月 18 18:15:17 pve networking[912]: error: vmbr1: dhclient: timeout failed to detect new ip addresses
11月 18 18:15:17 pve /usr/sbin/ifup[912]: error: vmbr1: dhclient: timeout failed to detect new ip addresses
11月 18 18:15:17 pve error[912]: vmbr1: dhclient: timeout failed to detect new ip addresses
11月 18 18:15:17 pve info[912]: executing /sbin/dhclient -6 -x -pf /run/dhclient6.vmbr1.pid -lf /var/lib/dhcp/dhclient6.vmbr1.leases vmbr1
11月 18 18:15:17 pve kernel: audit: type=1400 audit(1763460917.120:134): apparmor="DENIED" operation="create" class="net" info="failed protocol match" error=-13 profile="/{,usr/}sbin/dhclient" pid=1186 comm="dhclient" family="unix" sock_type="dgram" protocol=0 requested="create" denied="create" addr=none
11月 18 18:15:17 pve kernel: audit: type=1400 audit(1763460917.121:135): apparmor="DENIED" operation="create" class="net" info="failed protocol match" error=-13 profile="/{,usr/}sbin/dhclient" pid=1187 comm="dhclient" family="unix" sock_type="stream" protocol=0 requested="create" denied="create" addr=none
11月 18 18:15:17 pve kernel: audit: type=1400 audit(1763460917.121:136): apparmor="DENIED" operation="create" class="net" info="failed protocol match" error=-13 profile="/{,usr/}sbin/dhclient" pid=1187 comm="dhclient" family="unix" sock_type="stream" protocol=0 requested="create" denied="create" addr=none
11月 18 18:15:17 pve kernel: audit: type=1400 audit(1763460917.124:137): apparmor="DENIED" operation="create" class="net" info="failed protocol match" error=-13 profile="/{,usr/}sbin/dhclient" pid=1187 comm="dhclient" family="unix" sock_type="dgram" protocol=0 requested="create" denied="create" addr=none
11月 18 18:15:17 pve networking[1191]: Error: ipv6: address not found.
11月 18 18:15:18 pve info[912]: executing /bin/ip -6 addr show vmbr1
11月 18 18:15:18 pve info[912]: executing /sbin/dhclient -6 -pf /run/dhclient6.vmbr1.pid -lf /var/lib/dhcp/dhclient6.vmbr1.leases vmbr1
11月 18 18:15:18 pve kernel: audit: type=1400 audit(1763460918.140:138): apparmor="DENIED" operation="create" class="net" info="failed protocol match" error=-13 profile="/{,usr/}sbin/dhclient" pid=1195 comm="dhclient" family="unix" sock_type="dgram" protocol=0 requested="create" denied="create" addr=none
11月 18 18:15:18 pve kernel: audit: type=1400 audit(1763460918.140:139): apparmor="DENIED" operation="create" class="net" info="failed protocol match" error=-13 profile="/{,usr/}sbin/dhclient" pid=1196 comm="dhclient" family="unix" sock_type="stream" protocol=0 requested="create" denied="create" addr=none
11月 18 18:15:18 pve kernel: audit: type=1400 audit(1763460918.140:140): apparmor="DENIED" operation="create" class="net" info="failed protocol match" error=-13 profile="/{,usr/}sbin/dhclient" pid=1196 comm="dhclient" family="unix" sock_type="stream" protocol=0 requested="create" denied="create" addr=none
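11月 18 18:15:18 pve kernel: audit: type=1400 audit(1763460918.155:141): apparmor="DENIED" operation="create" class="net" info="failed protocol match" error=-13 profile="/{,usr/}sbin/dhclient" pid=1196 comm="dhclient" family="unix" sock_type="dgram" protocol=0 requested="create" denied="create" addr=none

Now look at the network configuration file (the relevant part). vmbr1 is the bridge that connects the VMs to each other, and its other end is attached to OpenWrt; see the architecture diagram in "Setting up a complete IPv6 macvlan network environment for Docker".
OpenWrt itself is a VM, so it can only start after the PVE host has booted. That means the DHCP request on vmbr1 can never succeed during host boot, and the boot hangs right here.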
auto vmbr0
iface vmbr0 inet dhcp
iface vmbr0 inet static
address 192.168.9.17/24
gateway 192.168.9.1
bridge-ports enp5s0
bridge-stp off
bridge-fd 0
iface vmbr0 inet6 auto
auto vmbr1
iface vmbr1 inet dhcp
bridge-ports none
bridge-stp off
bridge-fd 0
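iface vmbr1 inet6 auto

The fix is to give vmbr1 a static address (iface vmbr1 inet static), or simply not to register the PVE host on the vmbr1 subnet at all (iface vmbr1 inet manual). After all, in my network architecture, hosts on vmbr1 can already reach vmbr0, where PVE lives, through OpenWrt.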
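
As a rough sketch of that fix, the vmbr1 stanza in /etc/network/interfaces might look like the following. Only the inet method changes; the bridge options are copied from the original config. The commented-out static variant uses 10.0.0.2/24, which is a made-up placeholder and not an address from this setup. The original inet6 line is left out of the sketch, since the fix described here only concerns the inet method.

```
# Variant 1: bring the bridge up without assigning the host any IPv4 address,
# so boot no longer waits on a DHCP server that lives inside a VM.
auto vmbr1
iface vmbr1 inet manual
	bridge-ports none
	bridge-stp off
	bridge-fd 0

# Variant 2 (alternative, use instead of variant 1): a fixed address.
# 10.0.0.2/24 is a hypothetical placeholder; pick one from the subnet
# that OpenWrt serves on vmbr1.
#iface vmbr1 inet static
#	address 10.0.0.2/24
#	bridge-ports none
#	bridge-stp off
#	bridge-fd 0
```

Either way, vmbr1 comes up without depending on anything that only starts after the host has finished booting.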