I’m looking to create a build farm backed by Drone. With a combination of the GKE autoscaler and node taints / tolerations, I can autoscale the build pool down to 0 and use only preemptible builds.
However, at this time I can’t see any support for tolerations in Drone. There’s a node selector, but it doesn’t really work the same way. Specifically, with a node selector alone we can’t prevent execution on the build nodes in a way that allows the pool to autoscale back to 0; with taints and tolerations we can.
I can implement this with a mutating webhook. But has anyone else here already run into this? Would it be possible to add a patch for support in core?
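For reference, here is a minimal sketch of the mutating webhook idea, not a tested deployment. It only builds the AdmissionReview response that adds a toleration to a pod; serving it over HTTPS and registering it is left out. The taint key and value (`dedicated=drone-build`) and the function names are hypothetical, and it assumes the webhook registration is already scoped to build pods only (for example by namespace), since I don't know how Drone labels the pods it creates.

```python
import base64
import json

# Hypothetical taint on the build node pool; the key and value are assumptions.
BUILD_TOLERATION = {
    "key": "dedicated",
    "operator": "Equal",
    "value": "drone-build",
    "effect": "NoSchedule",
}


def toleration_patch(pod):
    """Return JSON Patch operations that add BUILD_TOLERATION to a pod."""
    existing = pod.get("spec", {}).get("tolerations")
    if not existing:
        # No tolerations list yet (missing or empty), so create it.
        return [{"op": "add", "path": "/spec/tolerations", "value": [BUILD_TOLERATION]}]
    if BUILD_TOLERATION in existing:
        # Already tolerated; nothing to change.
        return []
    # Append to the existing list.
    return [{"op": "add", "path": "/spec/tolerations/-", "value": BUILD_TOLERATION}]


def review_response(review):
    """Build an AdmissionReview response for an incoming AdmissionReview request."""
    request = review["request"]
    response = {"uid": request["uid"], "allowed": True}
    patch = toleration_patch(request["object"])
    if patch:
        # The API server expects the JSON Patch as a base64-encoded string.
        response["patchType"] = "JSONPatch"
        response["patch"] = base64.b64encode(json.dumps(patch).encode()).decode()
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }
```

The build node pool would carry the matching taint, so only pods that went through the webhook can schedule there and the pool can drain to 0 when no builds are running.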
Sorry, there is no support for taints at the moment. However, there is an open issue that you can subscribe to: https://github.com/drone/drone-runtime/issues/37
However, if you are building out a production build farm, I recommend using agents on traditional instances with our autoscaler. The Kubernetes runtime is still experimental and is not recommended for production use.
In terms of adding support for taints (I really appreciate the offer) I think we need to decide whether or not we will keep investing in the current experiment. I am increasingly convinced that we need to scrap the current experiment and try something different. See the following threads for some context:
Nice; I appreciate the need to find well-designed primitives.
I will think about this more.
In our case the Kubernetes primitives for autoscaling are appealing because they give us a single idea we can apply across our stack. Drone workloads are similar to, for example, application-specific transient jobs (email, import, export and so forth), and it’d be good to keep only the one model in mind.
However, I respect that Drone is not designed to only solve Kubernetes problems – I quite like the idea of a pipeline abstraction, and Knative seems like a good runner for this stuff.
TL;DR – Sounds good!