[solved] Builds stuck in pending state for drone:1.1.0

I have deployed Drone on Docker Swarm with the following configuration.

version: '3.5'

networks:
  drone:
    driver: overlay
    attachable: true
  traefik_public:
    external: true
    driver: overlay

volumes:
  drone-data:
    external: true

services:

  drone-server:
    image: "drone/drone:1.1.0"
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"
    networks:
      - traefik_public
      - drone
    environment:
      - DRONE_SERVER_HOST=drone.domain.in
      - DRONE_SERVER_PROTO=https
      - DRONE_TLS_AUTOCERT=false
      - DRONE_AGENTS_ENABLED=true
      - DRONE_BITBUCKET_CLIENT_ID=xxxxx
      - DRONE_BITBUCKET_CLIENT_SECRET=xxxxxx
      - DRONE_RPC_SECRET=d859a7aa9453f47ff567939024b0f7c2
      - DRONE_DATABASE_DRIVER=mysql
      - DRONE_DATABASE_DATASOURCE=xxxx:[redacted]@tcp(xx.xx.xx.xx:3306)/drone?parseTime=true
      - DRONE_LOGS_DEBUG=true
      - DRONE_USER_CREATE=cccccc:octocat,machine:false,admin:true,token:55f24eb3d61ef6ac5e83d550178638dc
    volumes:
      - drone-data:/var/lib/drone/
    deploy:
      placement:
        constraints:
          - node.role==manager
      replicas: 1
      labels:
        - "traefik.backend=drone-server"
        - "traefik.frontend.rule=Host:drone.domain.in"
        - "traefik.frontend.entryPoints=http"
        - "traefik.port=80"
        - "traefik.docker.network=traefik_public"
        - "traefik.enable=true"
      endpoint_mode: dnsrr
      mode: replicated

  drone-agent:
    image: "drone/agent:1.1.0"
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"
    networks:
      - drone
    environment:
      - DRONE_PLUGIN_PULL=true
      - DRONE_LOGS_DEBUG=true
      - DRONE_RPC_SERVER=http://drone-server
      - DRONE_RPC_SECRET=d859a7aa9453f47ff567939024b0f7c2
      - DRONE_RUNNER_CAPACITY=3
    depends_on:
      - drone-server
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    deploy:
      replicas: 3
      mode: replicated

Agent Logs:

{"arch":"amd64","level":"debug","machine":"002aa291582e","msg":"runner: polling queue","os":"linux","time":"2019-04-28T19:43:13Z"}
{"arch":"amd64","level":"debug","machine":"002aa291582e","msg":"runner: polling queue","os":"linux","time":"2019-04-28T19:43:14Z"}

Server Logs:

{"level":"info","msg":"main: internal scheduler enabled","time":"2019-04-28T19:42:04Z"}
{"build.limit":15000,"expires":"0001-01-01T00:00:00Z","kind":"trial","level":"debug","msg":"main: license loaded","repo.limit":0,"time":"2019-04-28T19:42:04Z","user.limit":0}
{"acme":false,"host":"drone.domain.in","level":"info","msg":"starting the http server","port":":80","proto":"https","time":"2019-04-28T19:42:04Z","url":"https://drone.domain.in"}
{"interval":"30m0s","level":"info","msg":"starting the cron scheduler","time":"2019-04-28T19:42:04Z"}

The most common root cause for builds stuck in pending is agent-to-server connectivity issues. You can use the following guide to triage connectivity issues, which also describes the information that is required to provide support: Builds are Stuck in Pending Status.

However, since you are using Docker Compose and you are presumably installing everything on the same server, I should point out that you do not need to use agents at all. Instead you can configure the server to run without agents, in single-machine mode, as described in the docs at https://docs.drone.io/installation/bitbucket-cloud/single-machine/
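As a rough illustration of what single-machine mode could look like for the compose file in this thread, here is a minimal sketch. It assumes that in this mode the server starts build containers itself, so it needs the Docker socket and the capacity setting, and that DRONE_AGENTS_ENABLED and DRONE_RPC_SECRET are simply left unset. The capacity value is a placeholder; check the linked docs before relying on any of this.

```yaml
# Sketch only (fragment): one drone-server service, no drone-agent service.
services:
  drone-server:
    image: "drone/drone:1.1.0"
    environment:
      - DRONE_SERVER_HOST=drone.domain.in
      - DRONE_SERVER_PROTO=https
      - DRONE_BITBUCKET_CLIENT_ID=xxxxx
      - DRONE_BITBUCKET_CLIENT_SECRET=xxxxxx
      # Assumption: the runner capacity is configured on the server in this mode.
      - DRONE_RUNNER_CAPACITY=2
    volumes:
      - drone-data:/var/lib/drone/
      # Assumption: the server needs the Docker socket to launch pipeline containers.
      - /var/run/docker.sock:/var/run/docker.sock
```

This only makes sense when builds should run on the same host as the server; it does not spread work across a multi-node cluster.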

I am using docker stack, because I have a Docker Swarm cluster of 12 machines. The Drone server runs on one of the manager nodes, and I allocated 3 worker nodes to the Drone agents, distributed across the cluster. I debugged this by attaching a container to the same overlay network the Drone services run on; from there I can ping the Drone server and telnet to its port 80 using the service name. I don't know why the agents are not picking up builds, since according to the agent logs they are polling the server queue for build instructions.

{"arch":"amd64","level":"debug","machine":"002aa291582e","msg":"runner: polling queue","os":"linux","time":"2019-04-28T19:43:13Z"}

This is not a single-machine deployment. I am indeed using a docker-compose file, but with deploy instructions, which work in a swarm cluster. Earlier I was using Drone 0.8.0, which worked flawlessly, but when I upgraded to 1.1.0 I ran into this problem. My whole repo build pipeline is affected, and now I am stuck on the upgrade.

The link I mentioned above describes how to debug, and the information required to assist. See Builds are Stuck in Pending Status.

Hi bradrydzewski,

I have gone through the link you suggested and enabled trace logging on both the server and the agents. I found that the server is somehow misconfigured. I am running the Drone setup in a Docker Swarm cluster, and requests to the Drone server are routed as follows:

"web-url -> nginx gateway (HTTPS only) -> traefik gateway (swarm cluster gateway) -> drone server container".

Let's say I have the URL drone.domain.com. So I configured:

  • DRONE_SERVER_HOST=https://drone.domain.in (I was hoping this would be the host URL at which the Drone server is reached from a web browser). Since I am running this setup in a container, do I have to provide the container hostname instead (which I can't, as it is dynamic)? Please clarify.
  • DRONE_SERVER_PROTO=http

The server configuration is as follows:

environment:
- DRONE_SERVER_HOST=drone.domain.in
- DRONE_SERVER_PROTO=https
- DRONE_TLS_AUTOCERT=false
- DRONE_AGENTS_ENABLED=true
- DRONE_BITBUCKET_CLIENT_ID=xxxxxxxxxxxxxxxxxxxx
- DRONE_BITBUCKET_CLIENT_SECRET=xxxxxxxxxxxxxxxxxxx
- DRONE_RPC_SECRET=yyyyyyyyyyyyyyyyyyyyyyyy
- DRONE_DATABASE_DRIVER=mysql
- DRONE_DATABASE_DATASOURCE=drone:[redacted]@tcp(xxx.xxx.xxx.xxx:3306)/drone
- DRONE_LOGS_TRACE=true
- DRONE_USER_CREATE=goelprateek:octocat,machine:false,admin:true,token:55f24eb3d61ef6ac5e83d550178638dc
- HTTPS_PROXY=http://xxx.xx.xxx.xxx:3128
- HTTP_PROXY=http://xxx.xx.xxx.xxx:3128

I am attaching the logs from the server:

license: ""
authn:
  endpoint: ""
  token: ""
  skipverify: false
agent:
  enabled: true
cron:
  disabled: false
  interval: 30m0s
cloning:
  alwaysauth: false
  username: ""
  password: ""
  image: ""
  pull: IfNotExists
database:
  driver: mysql
  datasource: drone:[redacted]@tcp(xxx.xx.xx.xx:3306)/drone?parseTime=true
  secret: ""
datadog:
  enabled: true
  endpoint: https://stats.drone.ci/api/v1/series
  token: ""
docker:
  config: ""
http:
  allowedhosts: []
  hostsproxyheaders: []
  sslredirect: false
  ssltemporaryredirect: false
  sslhost: ""
  sslproxyheaders: {}
  stsseconds: 0
  stsincludesubdomains: false
  stspreload: false
  forcestsheader: false
  browserxssfilter: true
  framedeny: true
  contenttypenosniff: false
  contentsecuritypolicy: ""
  referrerpolicy: ""
jsonnet:
  enabled: false
logging:
  debug: false
  trace: true
  color: false
  pretty: false
  text: false
proxy:
  addr: https://drone.domain.in
  host: drone.domain.in
  proto: https
registration:
  closed: false
registries:
  endpoint: ""
  password: ""
  skipverify: false
repository:
  filter: []
runner:
  local: false
  image: drone/controller:1.0.0
  platform: linux/amd64
  os: linux
  arch: amd64
  kernel: ""
  variant: ""
  machine: 0c1d8c30c01f
  capacity: 2
  labels: {}
  volumes: []
  networks: []
  devices: []
  privileged: []
  environ: {}
  limits:
    memswaplimit: 0
    memlimit: 0
    shmsize: 0
    cpuquota: 0
    cpushares: 0
    cpuset: ""
nomad:
  enabled: false
  datacenters:
  - dc1
  namespace: ""
  region: ""
  prefix: drone-job-
  image: ""
  imagepull: false
  memory: 1024
  cpu: 500
kube:
  enabled: false
  namespace: ""
  path: ""
  url: ""
  ttl: 300
  serviceaccountname: ""
  pullpolicy: Always
  image: ""
rpc:
  server: ""
  secret: d859a7aa9453f47ff567939024b0f7c2
  debug: false
  host: drone.domain.in
  proto: https
s3:
  bucket: ""
  prefix: ""
  endpoint: ""
  pathstyle: false
secrets:
  endpoint: ""
  password: ""
  skipverify: false
server:
  addr: https://drone.domain.in
  host: drone.domain.in
  port: :80
  proto: https
  acme: false
  cert: ""
  key: ""
session:
  timeout: 720h0m0s
  secret: eRrbTLF2Gn4cz8zyadJrITWTFkbXBa41
  secure: false
status:
  disabled: false
  name: ""
users:
  create:
    username: ""
    machine: false
    admin: true
    token: 55f24eb3d61ef6ac5e83d550178638dc
  filter: []
  minage: 0s
webhook:
  endpoint: []
  secret: ""
  skipverify: false
yaml:
  endpoint: ""
  secret: ""
  skipverify: false
bitbucket:
  clientid: eFvdcZ7ZRtSRZHf9Le
  clientsecret: xCKaqbZGqCVxyZ2B3m6fGPB676D8zfQQ
  skipverify: false
  debug: false
gitea:
  server: ""
  clientid: ""
  clientsecret: ""
  skipverify: false
  scope:
  - repo
  - repo:status
  - user:email
  - read:org
  debug: false
github:
  server: https://github.com
  apiserver: https://api.github.com
  clientid: ""
  clientsecret: ""
  skipverify: false
  scope:
  - repo
  - repo:status
  - user:email
  - read:org
  ratelimit: 0
  debug: false
gitlab:
  server: https://gitlab.com
  clientid: ""
  clientsecret: ""
  skipverify: false
  debug: false
gogs:
  server: ""
  skipverify: false
  debug: false
stash:
  server: ""
  consumerkey: ""
  consumersecret: ""
  privatekey: ""
  skipverify: false
  debug: false

{"level":"info","msg":"main: internal scheduler enabled","time":"2019-04-29T04:45:42Z"}
{"build.limit":15000,"expires":"0001-01-01T00:00:00Z","kind":"trial","level":"debug","msg":"main: license loaded","repo.limit":0,"time":"2019-04-29T04:45:42Z","user.limit":0}
{"interval":"30m0s","level":"info","msg":"starting the cron scheduler","time":"2019-04-29T04:45:42Z"}
{"acme":false,"host":"drone.serviceurl.in","level":"info","msg":"starting the http server","port":":80","proto":"https","time":"2019-04-29T04:45:42Z","url":"https://drone.domain.in"}

Please confirm: RPC is taking the host as drone.domain.in and the proto as https, with port 80, which I think is wrong.

Drone Agent Configuration:

  • DRONE_PLUGIN_PULL=true
  • DRONE_LOGS_TRACE=true
  • DRONE_RPC_SERVER=http://drone-server
  • DRONE_RPC_SECRET=d859a7aa9453f47ff567939024b0f7c2
  • DRONE_RUNNER_CAPACITY=3
  • HTTP_PROXY=http://xxx.xx.xxx.xxx:3128
  • HTTPS_PROXY=http://xxx.xx.xxx.xxx:3128

Drone Agent Logs:

{"arch":"amd64","level":"debug","machine":"02dc064a851a","msg":"runner: polling queue","os":"linux","time":"2019-04-29T05:26:00Z"}
[DEBUG] POST http://drone-server/rpc/v1/request
[ERR] POST http://drone-server/rpc/v1/request request failed: Post http://drone-server/rpc/v1/request: context deadline exceeded
{"arch":"amd64","level":"debug","machine":"02dc064a851a","msg":"runner: polling queue","os":"linux","time":"2019-04-29T05:26:00Z"}
[DEBUG] POST http://drone-server/rpc/v1/request
[ERR] POST http://drone-server/rpc/v1/request request failed: Post http://drone-server/rpc/v1/request: context deadline exceeded
{"arch":"amd64","level":"debug","machine":"02dc064a851a","msg":"runner: polling queue","os":"linux","time":"2019-04-29T05:26:00Z"}
[DEBUG] POST http://drone-server/rpc/v1/request
[DEBUG] POST http://drone-server/rpc/v1/request (status: 503): retrying in 1s (30 left)

I see in your logs a 503 service unavailable error. This error code does not come from Drone. This status code is not used anywhere in the Drone source code. The only possible error codes that Drone will send are 400, 401, 404, 409, 500 and 524 which can be audited and verified in this source file.

[DEBUG] POST http://drone-server/rpc/v1/request (status: 503): retrying in 1s (30 left)

This would imply the request never even makes it to the Drone server. If the connection were successful you would see an entry in the server logs for manager: request queue item. Perhaps there is some intermediate networking software that is intercepting the request, and returning the 503? Either way, it is clear the request is not making it to Drone.

Hi bradrydzewski,

Thanks for replying. I have described my setup in the reply above. Can you please go through it? And please clarify whether DRONE_SERVER_HOST is the container hostname or the callback URL configured in Bitbucket OAuth.

DRONE_SERVER_HOST is the public hostname, e.g. DRONE_SERVER_HOST=company.drone.io. If this were set incorrectly you would not be able to properly configure webhooks. Please note that this value does not have any bearing on the agent’s ability to connect with the server.
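To make the distinction concrete, here is a minimal compose fragment using the values already posted in this thread. It is a sketch, not a verified configuration: the public hostname and protocol go on the server, while the agent's RPC address is a separate setting that points at the internal service name.

```yaml
# Fragment only; other settings from the original stack file are omitted.
services:
  drone-server:
    environment:
      # Public hostname only, without a scheme. Used when configuring webhooks.
      - DRONE_SERVER_HOST=drone.domain.in
      # Protocol as seen from outside (the nginx gateway is HTTPS-only).
      - DRONE_SERVER_PROTO=https
  drone-agent:
    environment:
      # Internal address on the overlay network; unrelated to DRONE_SERVER_HOST.
      - DRONE_RPC_SERVER=http://drone-server
```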

Maybe the problem is that you set up HTTP_PROXY variables, which are intercepting agent-to-server requests and routing them through the proxy? Perhaps you need to add the drone server address to NO_PROXY?

Either way the trace logs you provided make it clear that the http request to http://drone-server/rpc/v1/request is being intercepted by something else that is returning a 503. This is where I recommend you focus your debugging efforts.
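A minimal sketch of the NO_PROXY idea for the agent service, reusing the masked proxy address from the posted configuration. It assumes the agent's HTTP client honors the conventional NO_PROXY variable and that the entry must match the hostname used in DRONE_RPC_SERVER; neither is confirmed in this thread.

```yaml
# Fragment only; other agent settings are omitted.
services:
  drone-agent:
    environment:
      - DRONE_RPC_SERVER=http://drone-server
      - HTTP_PROXY=http://xxx.xx.xxx.xxx:3128
      - HTTPS_PROXY=http://xxx.xx.xxx.xxx:3128
      # Assumption: requests to this host bypass the proxy and stay on the overlay network.
      - NO_PROXY=drone-server
```

With this in place, outbound traffic such as image pulls or clones could still use the proxy, while RPC polling would go directly to the server.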

For the time being I removed my HTTP_PROXY settings, so no requests are forwarded to the proxy. I am using a Docker Swarm overlay network, which means drone-agent should be able to reach drone-server by its service name. That is what I assigned using

  • DRONE_RPC_SERVER=http://drone-server

I am getting status 524. This means that drone-agent is not able to contact drone-server.

Earlier I had the same setup with drone:0.8.0, and it was running without any problems.

Hi bradrydzewski,

Thanks for your valuable support. The problem was indeed only the proxy settings: the agent was hitting the proxy server instead of the overlay network. Now builds are picked up as soon as a webhook request arrives.

I just want to confirm something about the drone-server environment variable DRONE_ESCALATE, which I used in version 0.8.0: is it deprecated? I haven't found any documentation for it in version 1.1.0.