Any general guidelines for EC2 instance sizing for drone server? Ballpark usage is maybe a dozen or so concurrent jobs. Jobs tend to only run during normal business hours, so I’m thinking a t2 could be appropriate?
Also we’re looking to use Amazon RDS for the backing DB, which also requires us to pick an instance size. Could I go with a small size for that too? Not sure how much drone is database-bound.
I would not recommend a small size for the build servers (Drone agents), but a small size is fine for the Drone server, and I recommend using RDS for its database. The Drone server itself doesn't use much memory or CPU. It is also worth considering spot instances, as long as you can accept that a prod deployment may be interrupted at random.
In our production environment, the RDS instance is currently a db.t2.medium, and its CPU usage stays below 2% for most of business hours. We run about 300 builds every week. The desired number of Drone agents is 6, which can scale up to 20, and drone_max_procs is 3, so we have a minimum of 18 concurrent build slots (6 x 3) available.
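The capacity figure above is simply the agent count multiplied by the processes allowed per agent. A minimal sketch of that arithmetic follows; the function name `build_slots` is hypothetical, and the numbers are the ones quoted in this post.

```python
def build_slots(agent_instances: int, max_procs: int) -> int:
    """Concurrent builds = agent instances x drone_max_procs per agent."""
    if agent_instances < 0 or max_procs < 0:
        raise ValueError("counts must be non-negative")
    return agent_instances * max_procs


# Desired capacity of the autoscaling group (6 agents, 3 procs each).
desired = build_slots(6, 3)
# Upper bound when the group is scaled out to 20 agents.
scaled_out = build_slots(20, 3)
print(desired, scaled_out)
```

Comparing the lower figure with your peak number of concurrent jobs is a quick way to check whether the desired agent count is enough.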
For non-prod environments, we run everything on spot instances only. We also set a scheduled job that scales the autoscaling group down to 0 at night when no nightly build is required.
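One way to set up that nightly scale-down is with scheduled actions on the autoscaling group. The sketch below is only an illustration under assumptions: the group name `drone-agents`, the action names, and the cron times are hypothetical, and the recurrence is evaluated in UTC unless a time zone is set on the action. It builds the parameters first so they can be reviewed, then applies them with boto3's `put_scheduled_update_group_action`.

```python
def nightly_schedule(group_name, desired, min_size, max_size,
                     stop_cron="0 20 * * 1-5", start_cron="0 7 * * 1-5"):
    """Return parameters for two scheduled actions on one autoscaling group.

    The first scales the group to 0 in the evening, the second restores the
    daytime sizes in the morning. Cron times here are assumptions.
    """
    stop = {
        "AutoScalingGroupName": group_name,
        "ScheduledActionName": f"{group_name}-night-stop",
        "Recurrence": stop_cron,
        "MinSize": 0,
        "MaxSize": max_size,
        "DesiredCapacity": 0,
    }
    start = {
        "AutoScalingGroupName": group_name,
        "ScheduledActionName": f"{group_name}-morning-start",
        "Recurrence": start_cron,
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired,
    }
    return [stop, start]


def apply_schedule(actions):
    """Create or update the scheduled actions (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the planning step works without it

    client = boto3.client("autoscaling")
    for action in actions:
        client.put_scheduled_update_group_action(**action)


if __name__ == "__main__":
    # Hypothetical group name; sizes match the 6-to-20 range mentioned above.
    for action in nightly_schedule("drone-agents", desired=6, min_size=6, max_size=20):
        print(action)
    # apply_schedule(...) is intentionally not called here.
```

If nightly builds are needed on some days, the evening action can be skipped or its cron expression narrowed to the days without them.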
With both approaches, you can save a lot of cost while still using big instance sizes (paying about the price of a medium, but running xlarge or 2xlarge instances). That also means better performance and a better experience using Drone. Otherwise, the developers will complain about it a lot, which is not Drone's fault!
Any general guidelines for EC2 instance sizing for drone server? Ballpark usage is maybe a dozen or so concurrent jobs.
The drone server has very modest memory and cpu requirements, and can usually be run on the smallest instance type. You should be able to start with the t2.nano. Check the instance usage during peak hours and upgrade if needed.
The drone agent also has very modest memory and cpu requirements. Estimating the size of your agent instance (and how many instances) depends on the workloads you are running. A single Java or C++ project could max out CPU and require 4GB of RAM to compile, compared to a node project that could have heavy I/O and network requirements when performing an npm install.
I generally recommend using multiple smaller machines, as opposed to 1 large machine. This will prevent noisy neighbor issues, so to speak. Years ago, when we had a Drone saas offering, there was a node project that took 2 minutes to execute and a C++ project that took 15 minutes. One day, the two projects had builds that were scheduled to run on the same server, at the same time. The node project took 17 minutes to execute because it was competing for resources with the C++ project. We received a number of complaints from our users, and decided we would only execute a single build on a machine at a time.
Of course ymmv, and it really depends on your workloads and end user expectations.
Also we’re looking to use Amazon RDS for the backing DB, which also requires us to pick an instance size.
Yes, this should be fine and can always be increased if you need more storage. In general drone is conservative with the amount of data that it stores. I would guess that user, repository and build information will take up tens of megabytes. The build logs, on the other hand, are stored in the database and can grow to gigabytes over time depending on number of builds and verbosity of build output. I would expect this to impact disk size more than anything.
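To get a rough feel for how build logs drive disk size, a back-of-the-envelope estimate can help. The sketch below is only an illustration: the function name `log_growth_gb` and the average log size of 500 KB per build are assumptions, not measured Drone figures, so substitute numbers from your own builds.

```python
def log_growth_gb(builds_per_week: float, avg_log_kb: float, weeks: float) -> float:
    """Estimate log storage in GB from build volume and average log size.

    Ignores database overhead (indexes, row headers), so treat the result
    as a lower bound.
    """
    total_kb = builds_per_week * avg_log_kb * weeks
    return total_kb / (1024 * 1024)


# Hypothetical workload: 300 builds a week, 500 KB of log output per build.
one_year = log_growth_gb(builds_per_week=300, avg_log_kb=500, weeks=52)
print(f"approx. {one_year:.1f} GB of logs after one year")
```

Checking the actual database size after a few weeks and comparing it with an estimate like this gives a better basis for choosing RDS storage.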
This is exactly what I needed. Thanks so much for the detailed response!
@bradrydzewski One follow-up to this: Is there a way to set a policy to purge old logs?
There are no immediate plans for an official purge routine. However, one could schedule a database query to periodically purge old data. A possible solution is described here: https://github.com/drone/docs/issues/238
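As a rough illustration of what such a scheduled purge might look like, here is a sketch that runs against an in-memory SQLite database. The schema is entirely hypothetical (tables `builds` and `logs`, columns `id`, `created`, `build_id`, `data`) and is not Drone's actual schema, so check the real table and column names in your database, and the linked issue, before adapting it. On MySQL or Postgres the driver and the placeholder syntax would differ as well.

```python
import sqlite3
import time

SECONDS_PER_DAY = 86400


def purge_old_logs(conn, max_age_days, now=None):
    """Delete log rows belonging to builds older than max_age_days.

    Returns the number of log rows removed. Table and column names are
    hypothetical placeholders, not Drone's real schema.
    """
    now = int(time.time()) if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    cur = conn.execute(
        "DELETE FROM logs WHERE build_id IN "
        "(SELECT id FROM builds WHERE created < ?)",
        (cutoff,),
    )
    conn.commit()
    return cur.rowcount


def demo():
    """Build a tiny example database and purge logs older than 90 days."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE builds (id INTEGER PRIMARY KEY, created INTEGER);
        CREATE TABLE logs (build_id INTEGER, data BLOB);
        """
    )
    now = 1_700_000_000  # fixed timestamp so the example is repeatable
    conn.executemany(
        "INSERT INTO builds VALUES (?, ?)",
        [(1, now - 100 * SECONDS_PER_DAY), (2, now - 1 * SECONDS_PER_DAY)],
    )
    conn.executemany(
        "INSERT INTO logs VALUES (?, ?)",
        [(1, b"old build output"), (2, b"recent build output")],
    )
    removed = purge_old_logs(conn, max_age_days=90, now=now)
    remaining = conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0]
    conn.close()
    return removed, remaining


if __name__ == "__main__":
    print(demo())
```

Only the log rows are deleted here and the build records are kept, so build history stays visible even after its output is gone. Taking a database snapshot before the first run is a sensible precaution.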