Free-Log-Service Failures

This document walks you through identifying Free-Log-Service failures on Harness NextGen and troubleshooting them.

1. This issue can arise for either of the following reasons:
a. The delegate agent cannot communicate with the build pod.
b. The build pod cannot communicate with the delegate agent.

2. How to identify Free-Log-Service failures
a. The failure can be observed while the pipeline is running. Normally an error pop-up is shown in the Harness UI, but only if the Feature Flag “CI_INDIRECT_LOG_UPLOAD” is enabled on your account.

Note: The “CI_INDIRECT_LOG_UPLOAD” Feature Flag takes 60 minutes to become active on the account.

b. If the Feature Flag “CI_INDIRECT_LOG_UPLOAD” is not enabled on your account, you can ask the Support team to enable it and then re-run the pipeline, or you can find the same error in the logs of your backend pods. It looks something like this:

		log_key:\\\"accountId:XXXXXXXXXXXXXXX/orgId:XXX/projectId:XXXXXX/pipelineId:test/runSequence:XXXX/level0:pipeline/level1:stages/level2:build/level3:spec/level4:execution/level5:steps/level6:alpine\\\" account_id:\\\"XXXXXXXXXXXXX\\\" container_port:20002\",\"step_id\":\"alpine\"}}"}
		time="2022-02-23T05:45:52Z" level=warning msg="http: request error. Retrying ..." error="Put \"https://storage.googleapis.com/free-log-service/XXXXXXXXXXXXX/accountId%XXXXXXXXXXX/orgId%3ANEF/projectId%3Adavidtest/pipelineId%3Atest/runSequence%3A9/level0%3Apipeline/level1%3Astages/level2%3Abuild/level3%3Aspec/level4%3Aexecution/level5%3Asteps/level6%3Aalpine?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=GOOG1EIRML2LPVSNQ5SOOPFC6JJJOGVFCJ34I55YBVUBKT3UL7IVPDNLEOZJA%2F20220223%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20220223T054341Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=c93e04539ed0de8d333901e52bf0ea17b278ff32efdadac39f92089d2adc53d7\":Forbidden" path=PUT
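When the pod logs are long, it can help to filter them for this specific warning. The following is a minimal sketch, not a Harness tool: it assumes the log lines have the same shape as the sample above (an `error="Put \"<url>\":<reason>"` field), and the function name `find_upload_failures` is hypothetical.

```python
import re
from urllib.parse import urlsplit, unquote

# Matches the error field of the retry warning, e.g.
#   error="Put \"https://.../free-log-service/...\":Forbidden"
# The URL is captured up to the escaped closing quote; the reason follows the colon.
FAILURE = re.compile(r'error="Put \\"(?P<url>[^"\\]+)\\":\s*(?P<reason>[^"]+)"')


def find_upload_failures(lines):
    """Yield (decoded_log_path, reason) for each failed PUT to free-log-service."""
    for line in lines:
        match = FAILURE.search(line)
        if match is None:
            continue
        path = urlsplit(match.group("url")).path
        # Ignore failed PUTs that target some other bucket or service.
        if "/free-log-service/" not in path:
            continue
        # Percent-decoding turns "level6%3Aalpine" back into "level6:alpine".
        yield unquote(path), match.group("reason")
```

Passing the lines of a saved pod log to `find_upload_failures` yields one pair per failed upload, so the affected step (the last `levelN:` segment of the path) and the reason, such as `Forbidden`, can be read at a glance.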

3. How to resolve Free-Log-Service failures
a. First, verify whether the delegate agent can communicate with the build pod. If it cannot, the cause will be the security group.

b. If you are using an EKS cluster, a security group needs to be added to each build pod.

c. Then verify whether the build pod can communicate with the delegate agent. You can check this by running grpcurl from the build pod.

d. If that check fails, the cause will be the security group of the ingress controller, which must allow incoming connections on port 8080, because the build pod talks to the delegate service via its cluster IP.
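Before reaching for grpcurl, a plain TCP check can show whether a security group is blocking the port at all. The sketch below is an illustration under assumptions, not a replacement for the grpcurl check: it only confirms that a TCP connection opens and says nothing about the gRPC layer. The function name `can_connect` and the placeholder addresses are hypothetical. Port 8080 comes from the step above, and using port 20002 for the build pod is an assumption based on the `container_port` value in the sample log.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Hypothetical usage; replace the placeholders with real addresses.
# From the delegate, towards the build pod (port assumed from the sample log):
#   can_connect("<build-pod-ip>", 20002)
# From the build pod, towards the delegate service cluster IP:
#   can_connect("<delegate-service-cluster-ip>", 8080)
```

A `False` result in either direction points at the corresponding security group. A `True` result only means the port is reachable, so the grpcurl check is still needed to confirm that the service itself responds.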