[Canary Deployment] How to deploy a TCP service with Istio Traffic Split on a Canary Workflow

Hello everyone :vulcan_salute:


Many Harness customers rely on one of the most powerful capabilities our platform provides for CD: Canary deployments. And one of the best, and most challenging, parts of a canary is the traffic split. When several services are involved, the power of Istio stands out alongside our platform.

Let’s see how Harness can work around a current design limitation and use Istio together with Canary Deployments for TCP services.



We’ll work in a scenario where the Kubernetes cluster already has Istio installed, and we’ll use the Istio ingress gateway. For this example, I’m going to deploy the tcp-echo-service that the Istio team provides for tests and examples (I just modified it to use some values obtained from Harness). The other important resources, all from Istio, are the DestinationRule, Gateway, and VirtualService. You can find all of them in my GitHub repository at this link.

  • DestinationRule

    Since we are creating a Canary Deployment to deploy our TCP echo server, our DestinationRule will define the policies applied to traffic intended for the service after routing has occurred. It’s also important to define subsets for the canary and stable versions in our DestinationRule, because subsets can be used for scenarios like A/B testing or routing to a specific version of a service. So in our scenario, we’ll configure the subsets based on the harness label:
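    A minimal sketch of what such a DestinationRule can look like. The resource name and the `harness.io/track` label (which Harness applies to distinguish canary and stable pods) are assumptions here; check the manifests in the linked repository for the exact version:

```yaml
# Sketch of a DestinationRule with canary/stable subsets.
# Assumption: Harness labels pods with harness.io/track: canary|stable.
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: {{.Values.name}}-destinationrule
spec:
  host: {{.Values.name}}-svc
  subsets:
  - name: canary
    labels:
      harness.io/track: canary
  - name: stable
    labels:
      harness.io/track: stable
```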


  • Gateway

    An ingress Gateway describes a load balancer operating at the edge of the mesh that receives incoming HTTP/TCP connections. It configures the exposed ports, protocols, and so on but, unlike a Kubernetes Ingress resource, it does not include any traffic routing configuration (spoiler alert: this is where our VirtualService comes in). Traffic routing for ingress traffic is instead configured using Istio routing rules, exactly the same way as for internal service requests. I’m using Istio’s default gateway implementation.
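    A sketch of a Gateway that exposes a TCP port on Istio’s default ingress gateway. The resource name and the port number are assumptions for illustration; the repository has the actual manifest:

```yaml
# Sketch of a Gateway for raw TCP traffic on Istio's default ingress gateway.
# The port number (31400) is an assumption for this example.
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: {{.Values.name}}-gateway
spec:
  selector:
    istio: ingressgateway   # use Istio's default gateway implementation
  servers:
  - port:
      number: 31400
      name: tcp
      protocol: TCP
    hosts:
    - "*"
```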

  • VirtualService

    Now that we have our DestinationRule with named subsets, we’ll create the VirtualService, which basically defines a set of traffic routing rules to apply when a host is addressed. The routing rules define matching criteria for traffic of a specific protocol; if the traffic matches, it is sent to a named destination service (or a subset/version of it). We created two subsets, one canary and one stable, and in our VirtualService we will define port 9000 to target these subsets, adding a weight for each one; the weights associated with a subset determine the proportion of traffic it receives. For example, the following rule will route, by default, 0% of the traffic for the “{{.Values.name}}-svc” service to instances with the “canary” tag and the remaining traffic (i.e., 100%) to “stable”. Later on, we’ll talk about the dynamic variables used in the weights of our VirtualService.
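    A sketch of the weighted TCP route described above. The `canaryWeight` and `stableWeight` values (defaulting to 0/100 in values.yaml, and overridden later by the Approval step variables) and the resource names are assumptions; the exact manifest lives in the repository:

```yaml
# Sketch of a VirtualService splitting TCP traffic on port 9000
# between the canary and stable subsets by weight.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: {{.Values.name}}-virtualservice
spec:
  hosts:
  - {{.Values.name}}-svc
  gateways:
  - {{.Values.name}}-gateway
  tcp:
  - match:
    - port: 9000
    route:
    - destination:
        host: {{.Values.name}}-svc
        port:
          number: 9000
        subset: canary
      weight: {{.Values.canaryWeight}}   # 0 by default
    - destination:
        host: {{.Values.name}}-svc
        port:
          number: 9000
        subset: stable
      weight: {{.Values.stableWeight}}   # 100 by default
```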

:warning: For traffic management using the Traffic Split step, Harness only supports HTTP in the VirtualService manifest, so we’ll need to skip the VirtualService in the Canary Deployment process and apply it using the Apply step. To have a Workflow ignore a resource file in a Service’s Manifests section, you add the following comment to the top of the file, which is why we can see this annotation present in our VirtualService.
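For reference, the skip directive that Harness documents for ignoring a manifest file, placed as the first line of the VirtualService manifest, looks like this:

```yaml
# harness.io/skip-file-for-deploy
```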


First Step

Let’s create our Service! A few considerations: the Deployment Type for our example will be Kubernetes, and I have added a dummy artifact because my service uses a specific image, istio/tcp-echo-server:1.2, so no special setup is needed for our Artifact Source.

Second Step

Let’s create the environment where our Service will be deployed. There’s no secret here: you define your target deployment infrastructure using a Harness Environment. Environments represent your deployment infrastructures, such as Dev, QA, Stage, Production, etc.

Third Step

Finally, let’s create our Canary Workflow and add two phases under Deployment Phases: Canary and Primary. The Canary phase creates a canary deployment using your Service’s manifest files and the number of pods you specify in the Workflow’s Canary Deployment step. The Primary phase runs the actual deployment as a rolling update with the number of pods specified in the Service’s manifest files.

Workflow Overview

Canary Phase Overview

Primary Phase Overview

Now that everything is ready, we just need to add Traffic Split to our Workflow! …Not so fast, young Padawan.

For traffic management using the Traffic Split step, Harness only supports HTTP in the VirtualService, and we’ll use ours for TCP traffic, so we’ll need to skip the VirtualService deploy in the Canary Deployment process and apply it using the Apply step. This is very important: because we’ll do the Istio traffic shifting with the Apply step, we need to move the Canary Delete step from the Wrap Up section of the Canary phase to the Wrap Up section of the Primary phase. Moving the Canary Delete step to the Wrap Up section of the Primary phase prevents traffic from being routed to deleted pods before it is routed to stable pods in the Primary phase. So let’s do this!

Fourth Step

Let’s move the Canary Delete step from the Wrap Up section of the Canary phase to the Wrap Up section of the Primary phase. To do that, we first remove the Canary Delete step from the Canary phase:

Now we’ll add it to the Primary phase, in the Wrap Up section.

With this, we have solved our potential issue of traffic being routed to deleted pods before traffic is routed to stable pods. Now let’s finally add our Traffic Split.

Fifth Step

Let’s go to our Canary phase and add a new section inside it called Traffic Split 10/90. It contains an Approval step, so I can export variables that will be used in our VirtualService template to override the default values, and an Apply step to apply our VirtualService, routing 10% of the traffic for the “{{.Values.name}}-svc” service to instances with the “canary” tag and the remaining traffic (i.e., 90%) to “stable”.

I will also add an intermediate phase, again containing only an Approval and an Apply step, that updates the Traffic Split to 20/80 (20% canary and 80% stable).

To use the values of the variables exported by our Approval step, I’ve added these fields to the values.yaml of our service manifest:

canaryWeight: ${canaryTrafficWeight.variables.canary}
stableWeight: ${canaryTrafficWeight.variables.stable}

Now, after performing our Traffic Split, we can finish and perform the rolling update with our Primary Phase.

Final Step

As soon as the rollout happens, and before the Canary Delete step in the Wrap Up section of the Primary phase clears our canary deployment, we need to return 100% of the traffic to the stable version of our application. So at this stage I also added an Approval step to shift all the traffic back to stable; the variables it exports are used in our VirtualService, which is applied right before the Canary Delete step.


I created a script to perform a check on our services; it can be found in the helpers folder of the repository I shared at the beginning of this article. Basically, it sends a timestamp to the TCP server and watches the echo, so we can see the traffic split occurring in practice.
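The idea behind the script can be sketched like this (the host, port, and function names are assumptions for illustration; see check.sh in the repository for the real thing): send a timestamp to the tcp-echo service many times, then tally which version prefixed the reply.

```shell
#!/bin/sh
# Sketch of a traffic-split check for the tcp-echo service.
# probe: send one timestamp to the service and print the echoed reply.
probe() {
  # "$1" is the ingress host, "$2" the TCP port (assumptions for this sketch)
  date +%s | nc -w 1 "$1" "$2"
}

# tally: count replies prefixed with "one" (stable) vs "two" (canary).
tally() {
  one=0; two=0
  while read -r reply; do
    case "$reply" in
      one*) one=$((one + 1)) ;;
      two*) two=$((two + 1)) ;;
    esac
  done
  echo "stable(one)=$one canary(two)=$two"
}
```

Running, for example, `for i in $(seq 1 100); do probe "$INGRESS_HOST" 9000; done | tally` (where `INGRESS_HOST` points at your Istio ingress gateway) prints the observed split.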

When running this script before launching our Canary Workflow, this is the output:

Now, let’s change our service to answer two instead of one.

And trigger our Workflow, applying only the 10/90 split.

Now our check.sh script should show about 10% of requests being directed to the canary deployment, with some servers answering two instead of one.

Once our Workflow completes, we should see all traffic going to the new stable version.

At this point our check.sh script should show 100% of requests being directed to the Primary deployment, with all servers answering two.

Now you can start exploring more possibilities with Harness and Istio :rocket: