Are pipeline stages/containers supposed to have multiple nics?

I just transitioned the agent from one host to another (beefier) one after about a year on the previous, and I'm now getting odd network flakiness. The repo under test has a 2 hr timeout, and its test stage runs a VM under QEMU. Luckily this stage is not very clever and will just hang around waiting to be killed by the timeout if the VM doesn't power off, so it's great for debugging odd network-related issues.

Well, my issue seems to stem from creating a dummy bridge in the container for QEMU's use and hard-coding its IP. The problem is that the IP I hard-code sometimes happens to land on eth0's subnet, and can even be the very address of the default gateway, which leads to problems.

This extra NIC seems to be set up by Drone. Is this new/on purpose? I can't recall this being the case before, and I'm not seeing much in the docs.

Here’s how things end up when we have problems:

agent host

$ ip addr show docker0
6: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 02:42:a6:11:fa:f6 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
    inet 10.42.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:a6ff:fe11:faf6/64 scope link 
       valid_lft forever preferred_lft forever

stuck container

# ip route; ip addr show eth0; ip addr show eth1; ip addr show hv
default via 172.18.0.1 dev eth0 
172.17.0.0/16 dev eth1 scope link  src 172.17.0.5 
172.18.0.0/24 dev hv scope link  src 172.18.0.1 
172.18.0.0/16 dev eth0 scope link  src 172.18.0.2 
345: eth0@if…: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP 
    link/ether 02:42:ac:12:00:02 brd ff:ff:ff:ff:ff:ff
    inet 172.18.0.2/16 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:acff:fe12:2/64 scope link 
       valid_lft forever preferred_lft forever
343: eth1@if…: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP 
    link/ether 02:42:ac:11:00:05 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.5/16 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::42:acff:fe11:5/64 scope link 
       valid_lft forever preferred_lft forever
2: hv: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP qlen 1000
    link/ether 36:7c:2a:64:6b:86 brd ff:ff:ff:ff:ff:ff
    inet 172.18.0.1/24 scope global hv
       valid_lft forever preferred_lft forever
    inet6 fe80::dccc:19ff:febc:33d9/64 scope link 
       valid_lft forever preferred_lft forever
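For what it's worth, a pure-shell containment check (my own sketch, nothing from the pipeline) confirms the overlap: the default gateway 172.18.0.1 falls inside hv's 172.18.0.0/24, which is more specific than eth0's /16, so gateway traffic gets captured by the bridge:

```shell
#!/bin/sh
# ip_to_int: dotted quad -> 32-bit integer.
ip_to_int() {
  oldifs=$IFS; IFS=.
  set -- $1            # intentional word splitting on "."
  IFS=$oldifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# in_subnet IP NETWORK PREFIXLEN: succeed if IP lies inside NETWORK/PREFIXLEN.
in_subnet() {
  mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$2") & mask )) ]
}

in_subnet 172.18.0.1 172.18.0.0 24 && echo "default gw is inside hv's /24"
```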

testing hunch

# ping -c1 -W1 8.8.8.8; ip addr del 172.18.0.1/24 dev hv; ping -c1 -W1 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes

--- 8.8.8.8 ping statistics ---
1 packets transmitted, 0 packets received, 100% packet loss
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: seq=0 ttl=122 time=993.446 ms

--- 8.8.8.8 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 993.446/993.446/993.446 ms

Now I can definitely be smarter about dynamically choosing a subnet for hv, but I'm curious to find out whether something is misconfigured on this new Drone agent (it shouldn't be) and/or why I never saw this issue on the old agent.
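By "smarter" I mean something like this rough sketch (the 10.99.0.0/16 range and the function name are arbitrary picks of mine): scan the container's routing table and take the first /24 that doesn't collide with an existing route:

```shell
#!/bin/sh
# pick_free_subnet: print the first 10.99.N.0/24 with no matching route.
pick_free_subnet() {
  routes=$(ip route 2>/dev/null)
  i=0
  while [ "$i" -le 255 ]; do
    case "$routes" in
      *"10.99.$i."*) i=$((i + 1)) ;;   # taken, try the next one
      *) echo "10.99.$i.0/24"; return 0 ;;
    esac
  done
  return 1
}

subnet=$(pick_free_subnet) || exit 1
echo "using $subnet for hv"
# ip addr add "${subnet%.0/24}.1/24" dev hv   # then give hv the .1 address
```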