
Argo Workflows for processing Aerial Imagery


Topo Workflows

Topo workflows are run on an AWS EKS cluster using Argo Workflows.

To get set up, you need access to the Argo user role inside the EKS cluster. Contact someone from Topo Data Engineering to get access. All Imagery maintainers already have access.

If creating your own workflow, or interested in the details of a current workflow please also read the


You will need

Ensure you have kubectl aliased to k

alias k=kubectl

To connect to the EKS cluster you need to be logged into AWS

aws-azure-login :account-name

Then set up the cluster connection. You only need to run this the first time you use the cluster.

You will need AWS CLI version > 2.7.x.

# For Imagery maintainers you will already have the correct role so no role arn is needed.
aws eks update-kubeconfig --name Workflow --region ap-southeast-2

# For AWS Admin users you will need to find the correct EKS role to use
aws eks update-kubeconfig --name Workflow --region ap-southeast-2 --role-arn arn:aws:iam::...

To validate that the cluster is connected:

k get nodes

NAME                                               STATUS   ROLES    AGE    VERSION
ip-255-100-38-100.ap-southeast-2.compute.internal   Ready    <none>   7d   v1.21.12-eks-5308cf7
ip-255-100-39-100.ap-southeast-2.compute.internal   Ready    <none>   7d   v1.21.12-eks-5308cf7
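The same check can be scripted. The sketch below is not part of this repository: `count_not_ready` is a hypothetical helper that reads `kubectl get nodes` style output on stdin and counts the nodes whose STATUS column is anything other than `Ready`.

```shell
# Hypothetical helper: count nodes whose STATUS column is not exactly "Ready".
# Reads the tabular output of `kubectl get nodes` on stdin.
# Note: a cordoned node ("Ready,SchedulingDisabled") is counted as not ready.
count_not_ready() {
  # NR > 1 skips the header row; $2 is the STATUS column.
  awk 'NR > 1 && NF > 0 && $2 != "Ready" { n++ } END { print n + 0 }'
}

# Usage against a live cluster (requires a configured kubeconfig):
#   kubectl get nodes | count_not_ready
```

A result of `0` means every listed node reported `Ready`.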

To make CLI access easier, you can set the default namespace to argo:

k config set-context --current --namespace=argo

Submitting a job

Once the cluster connection is set up, a job can be submitted with the CLI or through the web interface of the running argo-server.

argo submit --watch workflows/imagery/standardising.yaml

To open the web interface:

# Create a connection to the Argo server
k port-forward deployment/argo-workflows-server 2746:2746

xdg-open http://localhost:2746

Submit a Job Using the Argo UI

In the Workflows page:

  2. Edit using full workflow options
  4. (Locate File -> Open)
  5. + CREATE

Debugging Argo Workflows

Workflow Parameters


Workflow Logs


Logs in Elasticsearch

Elasticsearch is an analytics engine that allows us to store, search and analyse AWS logs.
Elasticsearch can be accessed through

Example Filters:

⚠️ Make sure you are using li-topo-production* and set the correct time filter.

All Logs for a Workflow: : "imagery-standardising-v0.2.0-60-9b7dq"

All Logs for a pod:
Click on the pod in the Argo UI and scroll through the summary table to find the pod name. : "imagery-standardising-v0.2.0-60-9b7dq.create-config"

List Failed STAC Validation Logs: : "imagery-standardising-v0.2.0-60-9b7dq" and data.valid : False

Find a Basemaps URL: : "imagery-standardising-v0.2.0-60-9b7dq" and data.url : *


data.title : "Wellington Urban Aerial Photos (1987-1988) SN8790" and data.url : *

Container version used

The kubernetes.container_hash field, available in Elasticsearch, gives the hash of the container that was used to run the task. It can be used to look up the matching version in the container registry for further investigation.

Workflow Artifacts

All workflow outputs and logs are stored in the linz-workflow-artifacts bucket on the li-topo-prod account.

All outputs follow the same naming convention:


For each pod the logs are saved as a main.log file within the related prefix.

Unless a different location is specified within the workflow code, output files will be uploaded to the corresponding prefix.

Note: This bucket has a 90 day expiration lifecycle.

Connecting to a Container

List pods:

k get pods -n argo
# note: if the default namespace is set to argo, `-n argo` is not required.

In the output next to the NAME of the pod, the READY column indicates how many Docker containers are running inside the pod. For example, 1/1 indicates there is one Docker container.

The output of the following command includes a Containers section. The first line in this section is the container name, for example, argo-server.

k describe pods *pod_name* -n argo

To access a container in a pod run:

k exec -it -n argo *pod_name* -- /bin/bash

Once inside the container you can run a number of commands. For example, if troubleshooting network issues, you could run the following:

watch -e nslookup


See Concurrency for details on how to set limits on how many workflow instances can be run concurrently.


error: exec plugin: invalid apiVersion ""

Upgrade the AWS CLI to > 2.7.x
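To confirm whether the installed AWS CLI is recent enough before retrying, the version can be compared in a script. This is a minimal sketch, not part of this repository: `aws_cli_version` and `version_ge` are hypothetical helpers, and the minimum of `2.7.0` used in the usage comment is an assumed reading of "> 2.7.x".

```shell
# Hypothetical helper: extract "X.Y.Z" from the first token of `aws --version`,
# which looks like "aws-cli/2.7.12 Python/3.9.11 ...".
aws_cli_version() {
  printf '%s\n' "$1" | sed -n 's|^aws-cli/\([0-9][0-9.]*\).*|\1|p'
}

# Hypothetical helper: return 0 if dotted version $1 >= $2, else 1.
# Fields are compared numerically, left to right; missing fields count as 0.
version_ge() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    n = split(a, x, "."); m = split(b, y, ".")
    len = (n > m) ? n : m
    for (i = 1; i <= len; i++) {
      xi = x[i] + 0; yi = y[i] + 0
      if (xi > yi) exit 0
      if (xi < yi) exit 1
    }
    exit 0
  }'
}

# Usage (the 2.7.0 threshold is an assumption):
#   installed=$(aws_cli_version "$(aws --version 2>&1)")
#   version_ge "$installed" 2.7.0 || echo "Upgrade the AWS CLI" >&2
```

Comparing field by field avoids the pitfall of string comparison, where `2.10.0` would sort before `2.7.0`.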

Using containers

Some tasks in the Workflows or WorkflowTemplates run inside a container. These containers are built from other repositories. Different tags are published for each of these containers:

  • latest
  • vX.Y.Z
  • vX.Y
  • vX

The container versions are managed by a workflow parameter that needs to be specified when submitting the workflow. The default value is the latest major version of the container. Using the major version tag (vX) with imagePullPolicy: Always ensures that the newest minor and patch releases of that major version are pulled each time a workflow runs with these containers.


The :latest tag should never be used in production, as it points to the latest build of the container, which could be an unstable version. We reserve this tag for testing purposes.

:vX.Y.Z, :vX.Y, :vX

These tags are intended to be used in production, as they are published for each stable release of the container.

  • :vX.Y changes dynamically as Z is incremented.
  • :vX changes dynamically as Y and Z are incremented.
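To make the relationship between the tags concrete, the sketch below lists the tags that a single stable release would be published under. `release_tags` is a hypothetical helper written for illustration; it only models the naming scheme described above and says nothing about how the containers are actually built or published.

```shell
# Hypothetical helper: print the three tags covering a release "vX.Y.Z".
release_tags() {
  version=${1#v}        # "1.2.3"  (leading "v" stripped)
  major=${version%%.*}  # "1"
  rest=${version#*.}    # "2.3"
  minor=${rest%%.*}     # "2"
  printf 'v%s\nv%s.%s\nv%s\n' "$version" "$major" "$minor" "$major"
}

# Usage:
#   release_tags v1.2.3   # prints v1.2.3, v1.2 and v1, one per line
```

Releasing v1.2.4 afterwards would add a new `v1.2.4` tag and move both `v1.2` and `v1` to the new build, while `v1.2.3` stays fixed.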

Using different versions

For each Workflow and WorkflowTemplate, there is a version-* parameter that specifies the version of the LINZ container to use.
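A version can be passed on the command line with the `-p` option of `argo submit`. The sketch below adds a small guard so that only release tags are accepted. `is_release_tag` is a hypothetical helper, and `version-example` is a placeholder parameter name; check the workflow file for the actual `version-*` parameter it defines.

```shell
# Hypothetical guard: accept only vX, vX.Y or vX.Y.Z; reject "latest" and anything else.
is_release_tag() {
  printf '%s\n' "$1" | grep -Eq '^v[0-9]+(\.[0-9]+){0,2}$'
}

# Usage ("version-example" is a placeholder, not a real parameter name):
#   tag=v2
#   if is_release_tag "$tag"; then
#     argo submit workflows/imagery/standardising.yaml -p version-example="$tag"
#   else
#     echo "Refusing to submit with tag '$tag'" >&2
#   fi
```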

