Drone 0.8.2 Build stuck

Hello,

As the title says, I’ve noticed that my builds get stuck randomly. The most recent one is weird: it actually completed the whole pipeline but still says “Running”. When I hit cancel, it says the build was successfully cancelled, but the build keeps running and no other job is picked up.

Here’s a screenshot:

docker version:

root@<host>:~# docker version
Client:
 Version:      17.09.0-ce
 API version:  1.32
 Go version:   go1.8.3
 Git commit:   afdb6d4
 Built:        Tue Sep 26 22:42:45 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.09.0-ce
 API version:  1.32 (minimum version 1.12)
 Go version:   go1.8.3
 Git commit:   afdb6d4
 Built:        Tue Sep 26 22:41:24 2017
 OS/Arch:      linux/amd64
 Experimental: false
root@<host>:~#

docker-compose.yml:

version: "2.1"

services:
  server:
    image: drone/drone:0.8.2
    ports:
      - 8585:8000
      - 9000
    volumes:
      - ./data:/var/lib/drone/
    restart: unless-stopped
    environment:
      - DRONE_OPEN=false
      - DRONE_ADMIN=UnAfraid
      - DRONE_GITHUB=true
      - DRONE_GITHUB_CLIENT=<github client>
      - DRONE_GITHUB_SECRET=<github secret>
      - DRONE_GITHUB_MERGE_REF=false
      - DRONE_SECRET=<secret>
      - DRONE_HOST=https://drone.mydomain.com

  agent:
    image: drone/agent:0.8.2
    command: agent
    restart: always
    depends_on: [ server ]
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    environment:
      - DRONE_SERVER=server:9000
      - DRONE_SECRET=<secret>

docker logs drone_agent_1

{"time":"2017-12-07T22:55:01Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","message":"received execution"}
{"time":"2017-12-07T22:55:01Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","message":"listen for cancel signal"}
{"time":"2017-12-07T22:55:02Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"plugins/git:latest","stage":"clone","exit_code":0,"exited":false,"message":"update step status"}
{"time":"2017-12-07T22:55:02Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"plugins/git:latest","stage":"clone","exit_code":0,"exited":false,"message":"update step status complete"}
{"time":"2017-12-07T22:55:04Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"plugins/git:latest","stage":"clone","message":"log stream opened"}
{"time":"2017-12-07T22:56:01Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","message":"pipeline lease renewed"}
{"time":"2017-12-07T22:56:25Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"plugins/git:latest","stage":"clone","message":"log stream copied"}
{"time":"2017-12-07T22:56:25Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"plugins/git:latest","stage":"clone","message":"log stream uploading"}
{"time":"2017-12-07T22:56:25Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"plugins/git:latest","stage":"clone","message":"log stream upload complete"}
{"time":"2017-12-07T22:56:25Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"plugins/git:latest","stage":"clone","message":"log stream closed"}
{"time":"2017-12-07T22:56:26Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"plugins/git:latest","stage":"clone","exit_code":0,"exited":true,"message":"update step status"}
{"time":"2017-12-07T22:56:26Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"plugins/git:latest","stage":"clone","exit_code":0,"exited":true,"message":"update step status complete"}
{"time":"2017-12-07T22:56:26Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"drillster/drone-volume-cache:latest","stage":"restore-cache","exit_code":0,"exited":false,"message":"update step status"}
{"time":"2017-12-07T22:56:26Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"drillster/drone-volume-cache:latest","stage":"restore-cache","exit_code":0,"exited":false,"message":"update step status complete"}
{"time":"2017-12-07T22:56:28Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"drillster/drone-volume-cache:latest","stage":"restore-cache","message":"log stream opened"}
{"time":"2017-12-07T22:56:28Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"drillster/drone-volume-cache:latest","stage":"restore-cache","message":"log stream copied"}
{"time":"2017-12-07T22:56:28Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"drillster/drone-volume-cache:latest","stage":"restore-cache","message":"log stream uploading"}
{"time":"2017-12-07T22:56:29Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"drillster/drone-volume-cache:latest","stage":"restore-cache","message":"log stream upload complete"}
{"time":"2017-12-07T22:56:29Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"drillster/drone-volume-cache:latest","stage":"restore-cache","message":"log stream closed"}
{"time":"2017-12-07T22:56:30Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"drillster/drone-volume-cache:latest","stage":"restore-cache","exit_code":0,"exited":true,"message":"update step status"}
{"time":"2017-12-07T22:56:30Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"drillster/drone-volume-cache:latest","stage":"restore-cache","exit_code":0,"exited":true,"message":"update step status complete"}
{"time":"2017-12-07T22:56:30Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"openjdk:8-jdk","stage":"build","exit_code":0,"exited":false,"message":"update step status"}
{"time":"2017-12-07T22:56:30Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"openjdk:8-jdk","stage":"build","exit_code":0,"exited":false,"message":"update step status complete"}
{"time":"2017-12-07T22:56:31Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"openjdk:8-jdk","stage":"build","message":"log stream opened"}
{"time":"2017-12-07T22:57:01Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","message":"pipeline lease renewed"}
{"time":"2017-12-07T22:57:19Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"openjdk:8-jdk","stage":"build","message":"log stream copied"}
{"time":"2017-12-07T22:57:19Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"openjdk:8-jdk","stage":"build","message":"log stream uploading"}
{"time":"2017-12-07T22:57:19Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"openjdk:8-jdk","stage":"build","message":"log stream upload complete"}
{"time":"2017-12-07T22:57:19Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"openjdk:8-jdk","stage":"build","message":"log stream closed"}
{"time":"2017-12-07T22:57:20Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"openjdk:8-jdk","stage":"build","exit_code":0,"exited":true,"message":"update step status"}
{"time":"2017-12-07T22:57:20Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"openjdk:8-jdk","stage":"build","exit_code":0,"exited":true,"message":"update step status complete"}
{"time":"2017-12-07T22:57:20Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"drillster/drone-volume-cache:latest","stage":"rebuild-cache","exit_code":0,"exited":false,"message":"update step status"}
{"time":"2017-12-07T22:57:21Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"drillster/drone-volume-cache:latest","stage":"rebuild-cache","exit_code":0,"exited":false,"message":"update step status complete"}
{"time":"2017-12-07T22:57:24Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"drillster/drone-volume-cache:latest","stage":"rebuild-cache","message":"log stream opened"}
{"time":"2017-12-07T22:57:25Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"drillster/drone-volume-cache:latest","stage":"rebuild-cache","exit_code":0,"exited":true,"message":"update step status"}
{"time":"2017-12-07T22:57:26Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"drillster/drone-volume-cache:latest","stage":"rebuild-cache","exit_code":0,"exited":true,"message":"update step status complete"}
{"time":"2017-12-07T22:57:26Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"appleboy/drone-telegram:latest","stage":"telegram","exit_code":0,"exited":false,"message":"update step status"}
{"time":"2017-12-07T22:57:27Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"appleboy/drone-telegram:latest","stage":"telegram","exit_code":0,"exited":false,"message":"update step status complete"}
{"time":"2017-12-07T22:57:29Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"appleboy/drone-telegram:latest","stage":"telegram","message":"log stream opened"}
{"time":"2017-12-07T22:57:29Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"appleboy/drone-telegram:latest","stage":"telegram","message":"log stream copied"}
{"time":"2017-12-07T22:57:29Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"appleboy/drone-telegram:latest","stage":"telegram","message":"log stream uploading"}
{"time":"2017-12-07T22:57:29Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"appleboy/drone-telegram:latest","stage":"telegram","message":"log stream upload complete"}
{"time":"2017-12-07T22:57:29Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"appleboy/drone-telegram:latest","stage":"telegram","message":"log stream closed"}
{"time":"2017-12-07T22:57:30Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"appleboy/drone-telegram:latest","stage":"telegram","exit_code":0,"exited":true,"message":"update step status"}
{"time":"2017-12-07T22:57:30Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","image":"appleboy/drone-telegram:latest","stage":"telegram","exit_code":0,"exited":true,"message":"update step status complete"}
{"time":"2017-12-07T22:57:31Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","error":"","exit_code":0,"message":"pipeline complete"}
{"time":"2017-12-07T22:57:31Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","message":"uploading logs"}
{"time":"2017-12-07T22:58:01Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","message":"pipeline lease renewed"}
{"time":"2017-12-07T22:59:01Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","message":"pipeline lease renewed"}
{"time":"2017-12-07T23:00:01Z","level":"debug","repo":"UnAfraid/<my repo>","build":"336","id":"2251","message":"pipeline done"}

docker logs drone_server_1 is full of spam with the very same message:

INFO: 2017/12/08 11:44:36 grpc: Server.processUnaryRPC failed to write status stream error: code = DeadlineExceeded desc = "context deadline exceeded"

After restarting the container it started building other commits, but this one still says Running.

Now that’s even stranger

I’ve run 5 agents and strange things started appearing. Maybe it happens when the server is under load?

I had the same thing happen last week. The only way I got around it was by killing the build via the CLI: drone build kill <repo> <build number>