Helm deployment of Drone runner docker is unstable


Since drone-runner-kube has been deprecated, we switched to drone-runner-docker, which we run in an EKS cluster. For the deployment we use your official Helm charts:

helm upgrade drone drone/drone -f ./drone/server.values.yaml --namespace drone --version 0.6.4 --install
helm upgrade drone-runner-docker drone/drone-runner-docker -f ./drone/runner.values.yaml --namespace drone --version 0.6.1 --install
helm upgrade drone-kubernetes-secrets drone/drone-kubernetes-secrets -f ./drone/kubernetes-secrets.values.yaml --namespace drone --version 0.1.4 --install

The first thing we noticed is that it uses far more resources than the kube runner: the dind container sometimes uses more than 2 GB of RAM, and its CPU usage climbs above 3500. We are running 3 runner pods, each on its own t3a.xlarge instance (4 CPUs and 16 GB of RAM).

Here is the problem:
The drone-runner-docker container crashes frequently, and the logs say:

drone-runner-docker time="2023-01-15T02:40:20Z" level=info msg="starting the server" addr=":3000"
drone-runner-docker "cannot ping the remote server" error="Post \"http://drone.drone.svc.cluster.local:8080/rpc/v2/ping\": dial tcp: i/o timeout"
received signal, terminating process

In the Kubernetes events we see:

Readiness probe failed: dial tcp connect: connection refused

The IP address in that event belongs to the drone-runner-docker container. The runner works fine for some time, then crashes about 50 times in a row, and then starts working again. Please find some Grafana graphs attached. The crashes seem to occur randomly; they do not correlate with high workload.
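
To check whether each restart really lines up with a failed ping to the server, the runner logs can be filtered for that message. This is only a minimal sketch: the helper name count_ping_failures is made up for illustration, and the label selector and container name in the commented kubectl call are assumptions that may not match what the chart actually sets.

```shell
# Count runner log lines that report a failed ping to the Drone server.
# Reads log text on stdin and prints the number of matching lines.
count_ping_failures() {
  # grep -c prints the match count; it exits non-zero when the count is 0,
  # so "|| true" keeps the function usable under "set -e".
  grep -c 'cannot ping the remote server' || true
}

# Example invocation (selector and container name are assumptions):
# kubectl logs -n drone -l app.kubernetes.io/name=drone-runner-docker \
#   -c drone-runner-docker --previous --tail=-1 | count_ping_failures
```

With --previous, kubectl returns the logs of the last terminated container instance, so a non-zero count there would suggest the restart followed a ping timeout rather than, say, an out-of-memory kill.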

Does anybody know how to fix this?

Best wishes