Argo Workflows as a performance test tool

Argo Workflows is an excellent tool for orchestrating load tests.

Why?

Let’s start by defining why load testing is complicated.

  1. You need lots of hardware, but only during the load test: to run a big load test, you will require multiple computers to push data to your system, and after the test is done, all this computing power can be powered off to save money.
  • Argo is Kubernetes-based; therefore, it has auto-scaling by design, and it will scale down the test nodes as soon as the tests stop.
  2. You will most likely use multiple technologies: there are great tools for load testing (K6, JMeter, Locust, Goose, etc.), and in big projects you will have more than one technology to deal with, maybe all of them.
  • Argo runs Docker containers as its steps. Therefore, if the load test scripts can be packed as a Docker image, they will work on Argo gracefully.
  3. Load tests are hard to follow: while they run, you need to be able to follow all executing scripts, watch for errors and misconfigurations, and stop the tests if and when needed.
  • Argo has an easy and straightforward web UI, making it easy to start, kill, monitor, or watch test execution.

On top of all that, Argo is open-source, distributed under the Apache-2.0 license, and has great command-line tools. The documentation is vast, and it also serves other use cases, like machine learning.

If you don’t know Argo Workflows yet, see this video.

I have been using Argo as a load test orchestrator for two years, and it has delivered great results. In this blog post, I show how it made the load tests for my open-source project Nun-db easy to execute and follow.

Installation

I am assuming you already have a Kubernetes cluster running. Then all you need to do is create the namespace for Argo and apply the YAML from the official repository.


kubectl create namespace argo
kubectl apply -n argo -f https://github.com/argoproj/argo-workflows/releases/download/v3.3.5/install.yaml

Validating the installation worked

To validate that Argo is working as expected at this point, let’s check which pods are running with the command kubectl get pods -n argo. If the result looks like the following, all is fine.

kubectl get pods -n argo
NAME                                   READY   STATUS    RESTARTS   AGE
argo-server-d74f6677f-wtl6j            1/1     Running   0          13m
workflow-controller-7449c69756-vl6kb   1/1     Running   0          13m

Login with token

After installing, you will need a token to authenticate. Use the following bash command to get the token to log in.

kubectl get secrets -n argo | grep argo-server-token | awk '{ print $1 }' | xargs -I '{}' kubectl get secrets -n argo {} -o jsonpath="{.data.token}" | base64 -d | awk '{ print "Bearer " $1 }'
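
If that command returns nothing, the token secret may simply not exist: on Kubernetes 1.24 and newer, service account token secrets are no longer created automatically. In that case you can create one yourself; here is a minimal sketch (the secret name argo-server-token is my choice, any name works):

apiVersion: v1
kind: Secret
metadata:
  name: argo-server-token
  namespace: argo
  annotations:
    # bind this token to the argo-server service account created by the install manifest
    kubernetes.io/service-account.name: argo-server
type: kubernetes.io/service-account-token

Apply it with kubectl apply -f and rerun the command above.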

Opening the Web-UI

Now it is time to open Argo in your browser. To do that, run the command below to port-forward the Argo server port to your local environment, then open it over HTTPS in your browser.

kubectl -n argo port-forward deployment/argo-server 2746:2746

Then open the URL: https://localhost:2746/workflows/argo?limit=50

Now you have Argo running and working on your own cluster. Time to run some load tests against your system.

In this post, I will go over one simple test I use for Nun-db and show a workflow with two kinds of tests running on Argo.

First test with K6 load testing and metrics

The details of the following test are not essential to this subject. Nevertheless, in that script, I measure the time from setting a key to receiving the notification of the change on a second connection. I record that as a Trend metric so I can plot it in other tools like Grafana. The test fails if the p95 of my metric is bigger than 100 ms.

import ws from 'k6/ws';
import {
    check
} from 'k6';

import {
    Trend
} from 'k6/metrics';

const watchGage = new Trend('watch', true);

export default function() {
    const url = 'wss://omitted-for-obvious-reasons';
    const params = {};
    const keyName = `name_${__VU}_${Date.now()}`;

    const response = ws.connect(url, params, function(socket) {
        socket.on('open', function open() {
            console.log('connected');

            socket.send('auth mateus mateus')
            socket.send('create-db vue vue_pwd');
            socket.send('use-db vue vue_pwd');
            socket.send(`remove ${keyName}`);
            socket.send(`watch ${keyName}`);
            socket.setInterval(function timeout() {
                socket.send(`set ${keyName} mateus-${Date.now()}`);
                console.log('Pinging every 1sec (setInterval test)');
            }, 1000);
        });

        socket.on('message', (data) => {
            console.log(data);
            if (~data.indexOf('changed')) {
                const now = Date.now();
                const time = data.split('-')[1];
                const spent = now - time;
                watchGage.add(spent);
            }
            console.log('Message received: ', data)
        });


        socket.on('close', () => console.log('disconnected'));

        socket.on('error', (e) => {
            if (e.error() != 'websocket: close sent') {
                console.log('An unexpected error occurred: ', e.error());
            }
        });

        socket.setTimeout(function() {
            console.log('20 seconds passed, closing the socket');
            socket.close();
        }, 20000);
    });

    check(response, {
        'status is 101': (r) => r && r.status === 101
    });
}

export const options = {
    stages: [{
            duration: '5m',
            target: 10
        }, // ramp up from 0 to 10 users over 5 minutes
        {
            duration: '5m',
            target: 100
        }, // ramp up to 100 users over 5 minutes
        {
            duration: '5m',
            target: 200
        }, // ramp up to 200 users over 5 minutes
        {
            duration: '5m',
            target: 50
        }, // ramp down to 50 users over 5 minutes
        {
            duration: '5m',
            target: 0
        }, // ramp down to 0 users
    ],
    thresholds: {
        // fail the test if the 95th percentile of the watch metric exceeds 100ms
        'watch': ['p(95)<100'],
    },
};

Save all of this in a file named ws.js.
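
If you have K6 installed locally, you can sanity-check the script before building any image; for example:

k6 run --vus 2 --duration 30s ws.js

(The target URL is hard-coded in the script at this point, so it will run against whatever endpoint you put there.)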

Now we have our first load test script to run. Next, we need to create a Docker image to run our test.

The test Dockerfile

# Bundle the test scripts on top of the official k6 image
FROM grafana/k6
COPY ./*.js ./
ENTRYPOINT ["k6", "run", "--vus", "10", "--duration", "30s", "ws.js"]

Pushing it to Docker Hub

    docker build -t mateusfreira/nun-db-k6s-test .
    docker push mateusfreira/nun-db-k6s-test
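
It is worth a quick sanity check that the image starts and runs the bundled script before wiring it into Argo:

docker run --rm mateusfreira/nun-db-k6s-test

This runs the ENTRYPOINT defined above (10 VUs for 30 seconds) and should end with the usual K6 summary.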

Now it is time to create the workflow.

On the web UI, click on Workflow Templates, then Create New Workflow Template.

You can copy the workflow below into your new template. Afterward, I will point out the essential parts.

metadata:
  name: nun-db-load-test
  namespace: argo
  labels:
    nunDb: 'true'
spec:
  templates:
    - name: full-load
      inputs: {}
      outputs: {}
      metadata: {}
      steps:
        - - name: send-slack-notification
            template: send-slack-notification
            arguments: {}
        - - name: nun-db-k61
            template: nun-db-k6
            arguments: {}
          - name: nun-db-k62
            template: nun-db-k6
            arguments: {}
          - name: nodejs-test
            template: nun-db-node-js
            arguments: {}
        - - name: send-slack-notification-end
            template: send-slack-notification
            arguments: {}
    - name: send-slack-notification
      inputs:
        parameters:
          - name: message
            value: '{{workflow.parameters.message}}'
      outputs: {}
      metadata: {}
      container:
        name: main
        image: 'argoproj/argosay:v2'
        command:
          - /argosay
        args:
          - echo
          - '{{inputs.parameters.message}}'
        resources: {}
    - name: nun-db-node-js
      inputs: {}
      outputs: {}
      metadata: {}
      container:
        name: nodejs
        image: 'mateusfreira/nun-db-node-test:amd64'
        command:
          - sh
        args:
          - '-c'
          - node index.js
        env:
          - name: URL
            value: '{{workflow.parameters.url}}'
          - name: DB
            value: test
          - name: PWD
            value: test-pwd
        resources: {}
    - name: nun-db-k6
      inputs: {}
      outputs: {}
      metadata: {}
      container:
        name: k6
        image: mateusfreira/nun-db-k6s-test
        command:
          - k6
        args:
          - run
          - '--vus'
          - '{{workflow.parameters.k6Vus}}'
          - '--duration'
          - '{{workflow.parameters.k6Duration}}'
          - ws.js
        env:
          - name: URL
            value: '{{workflow.parameters.url}}'
          - name: USERNAME
            value: '{{workflow.parameters.userName}}'
          - name: PWD
            value: '{{workflow.parameters.pwd}}'
        resources: {}
  entrypoint: full-load
  arguments:
    parameters:
      - name: message
        value: Running argo tests for nun-db
      - name: userName
        value: nun
      - name: pwd
        value: Empty
      - name: url
        value: 'wss://omitted-for-obvious-reasons'
      - name: k6Duration
        value: 30s
      - name: k6Vus
        value: '10'
  ttlStrategy:
    secondsAfterCompletion: 300
  podGC:
    strategy: OnPodCompletion
  workflowMetadata:
    labels:
      nunDb: 'true'

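You can also trigger a run from the command line instead of the web UI. Assuming the template above has been created, something like this should work with the Argo CLI:

argo submit -n argo --from workflowtemplate/nun-db-load-test --watch

The --watch flag keeps your terminal attached so you can follow the steps as they execute.
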
Here is how it looks when running.

The workflow renders like the image below: a pretty standard load test flow. You run multiple tests and send a notification at the start and, if everything succeeds, at the end.

(Screenshot: the nun-db load test workflow running in the Argo web UI.)

Important parts of the workflow

Defining the workflow

Here is where you define the shape of your workflow. When you see a double dash (- -), the step runs after the previous step; a single dash (-) means the step runs in parallel with the previous one. In this case, we run send-slack-notification alone; then nun-db-k61, nun-db-k62, and nodejs-test all at the same time; and then, at the end, we run send-slack-notification-end. That gives you the shape of your workflow.

...
    - name: full-load
      inputs: {}
      outputs: {}
      metadata: {}
      steps:
        - - name: send-slack-notification
            template: send-slack-notification
            arguments: {}
        - - name: nun-db-k61
            template: nun-db-k6
            arguments: {}
          - name: nun-db-k62
            template: nun-db-k6
            arguments: {}
          - name: nodejs-test
            template: nun-db-node-js
            arguments: {}
        - - name: send-slack-notification-end
            template: send-slack-notification
            arguments: {}


Defining steps

The most critical parts of this block are the Docker image, the parameters it takes, and the command to run. In the K6 example, we can see how to define what command runs in that step. There we run the k6 command and use the workflow parameters to define how many virtual users (--vus) I want and for how long the test will run (--duration). Finally, I pass the URL, user, and password through the environment variables my script expects: URL, USERNAME, and PWD. These environment variables are used inside the test script to perform the desired test.

    - name: nun-db-k6
      inputs: {}
      outputs: {}
      metadata: {}
      container:
        name: k6
        image: mateusfreira/nun-db-k6s-test
        command:
          - k6
        args:
          - run
          - '--vus'
          - '{{workflow.parameters.k6Vus}}'
          - '--duration'
          - '{{workflow.parameters.k6Duration}}'
          - ws.js
        env:
          - name: URL
            value: '{{workflow.parameters.url}}'
          - name: USERNAME
            value: '{{workflow.parameters.userName}}'
          - name: PWD
            value: '{{workflow.parameters.pwd}}'
        resources: {}
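
For these variables to have any effect, the script must read them: K6 exposes environment variables through the global __ENV object. A minimal sketch of how ws.js could pick them up instead of hard-coding the URL (the fallback values are placeholders of my own choosing):

// K6 exposes environment variables on the global __ENV object.
const url = __ENV.URL || 'wss://example.com';
const user = __ENV.USERNAME || 'nun';
const pwd = __ENV.PWD || '';

// Then, inside the connect handler:
// socket.send(`auth ${user} ${pwd}`);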


Making your workflow sexier

Use JSON to make your workflow sexier and more useful. Here is an example: let’s say we want to run one of our K6 test scripts against several different configurations. I can use JSON for that, e.g.:

[
    {
     "url": "someUrl",
     "vus": 2
    },
    {
     "url": "someOtherUrl",
     "vus": 10
    },
    {
     "url": "someOtherUrl",
     "vus": 3
    }
]

With that, you can configure some of your steps to run once for each item in this JSON config. For example (here, I am changing the step nodejs-test from the previous code):

...
          - name: nodejs-test
            template: nun-db-node-js
            arguments:
              parameters:
                - name: db
                  value: '{{item.db}}'
                - name: pwd
                  value: '{{item.pwd}}'
            withParam: '{{workflow.parameters.nodeTestConfigs}}'
...
  entrypoint: full-load
  arguments:
    parameters:
...
      - name: nodeTestConfigs
        value: |
          [{ "db": "test", "pwd": "test-pwd" }, { "db": "vue", "pwd": "vue-pwd" }]
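
For reference, the nun-db-node-test image is expected to read its configuration from those environment variables. The actual index.js is not shown in this post; here is a hypothetical sketch of what such a script could look like:

// Hypothetical Node.js test: one set round-trip against Nun-db over WebSocket.
// Assumes the 'ws' npm package; URL, DB, and PWD come from the environment.
const WebSocket = require('ws');

const url = process.env.URL;
const db = process.env.DB;
const pwd = process.env.PWD;

const socket = new WebSocket(url);
socket.on('open', () => {
  socket.send(`use-db ${db} ${pwd}`);
  socket.send(`set hello world-${Date.now()}`);
});
socket.on('message', (data) => {
  console.log('received:', data.toString());
  socket.close();
});
socket.on('error', (err) => {
  console.error('test failed:', err.message);
  process.exit(1);
});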

Final YAML version

metadata:
  name: nun-db-load-test
  namespace: argo
  labels:
    nunDb: 'true'
spec:
  templates:
    - name: full-load
      inputs: {}
      outputs: {}
      metadata: {}
      steps:
        - - name: send-slack-notification
            template: send-slack-notification
            arguments: {}
        - - name: nun-db-k61
            template: nun-db-k6
            arguments: {}
          - name: nun-db-k62
            template: nun-db-k6
            arguments: {}
          - name: nodejs-test
            template: nun-db-node-js
            arguments:
              parameters:
                - name: db
                  value: '{{item.db}}'
                - name: pwd
                  value: '{{item.pwd}}'
            withParam: '{{workflow.parameters.nodeTestConfigs}}'
        - - name: send-slack-notification-end
            template: send-slack-notification
            arguments: {}
    - name: send-slack-notification
      inputs:
        parameters:
          - name: message
            value: '{{workflow.parameters.message}}'
      outputs: {}
      metadata: {}
      container:
        name: main
        image: 'argoproj/argosay:v2'
        command:
          - /argosay
        args:
          - echo
          - '{{inputs.parameters.message}}'
        resources: {}
    - name: nun-db-node-js
      inputs:
        parameters:
          - name: db
          - name: pwd
      outputs: {}
      metadata: {}
      container:
        name: nodejs
        image: 'mateusfreira/nun-db-node-test:amd64'
        command:
          - sh
        args:
          - '-c'
          - node index.js
        env:
          - name: URL
            value: '{{workflow.parameters.url}}'
          - name: DB
            value: '{{inputs.parameters.db}}'
          - name: PWD
            value: '{{inputs.parameters.pwd}}'
        resources: {}
    - name: nun-db-k6
      inputs: {}
      outputs: {}
      metadata: {}
      container:
        name: k6
        image: mateusfreira/nun-db-k6s-test
        command:
          - k6
        args:
          - run
          - '--vus'
          - '{{workflow.parameters.k6Vus}}'
          - '--duration'
          - '{{workflow.parameters.k6Duration}}'
          - ws.js
        env:
          - name: URL
            value: '{{workflow.parameters.url}}'
          - name: USERNAME
            value: '{{workflow.parameters.userName}}'
          - name: PWD
            value: '{{workflow.parameters.pwd}}'
        resources: {}
  entrypoint: full-load
  arguments:
    parameters:
      - name: message
        value: Running argo tests for nun-db
      - name: userName
        value: nun
      - name: pwd
        value: Empty
      - name: url
        value: 'wss://omitted-for-obvious-reasons'
      - name: k6Duration
        value: 30s
      - name: k6Vus
        value: '10'
      - name: nodeTestConfigs
        value: |
          [{ "db": "test", "pwd": "test-pwd" }, { "db": "vue", "pwd": "vue-pwd" }]
  ttlStrategy:
    secondsAfterCompletion: 300
  podGC:
    strategy: OnPodCompletion
  workflowMetadata:
    labels:
      nunDb: 'true'


Conclusion

Load testing is a crucial step for any successful software project that does not want to be surprised in production.

Load tests are useful for predicting problems before they happen in a production environment. Nevertheless, they are hard to build and can get complicated to execute, follow, and manage once they involve many different technologies, steps, and services. Argo plays nicely in this environment, orchestrating that complexity and making it easy. There are countless closed-source tools that address the same problem, but Argo does the job for free and opens up a whole new world of possibilities to explore. I recommend Argo for load tests; it is my first choice for the job. See you in the next post.

Complementary reading

A utility for testing different network configurations

https://pypi.org/project/tcconfig/

Performance testing with K6

https://k6.io/

Performance testing with JMeter

https://jmeter.apache.org/

Performance testing in Rust with Goose (a newer project)

https://github.com/tag1consulting/goose

Written on July 10, 2022