Running Redash on Kubernetes

Redash is a very handy tool that lets you connect to various data sources and produce interesting graphs. Your BI people most likely love it already.

Redash makes use of Redis, Postgres and a number of services written in Python, as can be seen in this example docker-compose.yml file. However, there is very little information on how to run it on Kubernetes. I suspect that part of the reason is that while docker-compose.yml makes use of YAML’s merge keys, kubectl does not support them. So the templates that do exist repeat the same large block of lines several times. There must be a better way, right?

Since the example deployment with docker-compose runs all services on a single host, I decided to run my example deployment in a single pod with multiple containers. You can always switch afterwards to a layout that suits your needs better.

Next came my quest to deal with the redundancy in the environment variables shared by the different Redash containers. If only there was a template or macro language I could use. Well, the most readily available one, with the least installation hassle (if it is not already on your system), is m4. And you do not have to do weird sendmail.cf stuff, as you will see. Using m4 allows us to run something like m4 redash-deployment-simple.m4 | kubectl apply -f - and be done with it:

divert(-1)
define(redash_environment, `
        - name: PYTHONUNBUFFERED
          value: "0"
        - name: REDASH_REDIS_URL
          value: "redis://127.0.0.1:6379/0"
        - name: REDASH_MAIL_USERNAME
          value: "redash"
        - name: REDASH_MAIL_USE_TLS
          value: "true"
        - name: REDASH_MAIL_USE_SSL
          value: "false"
        - name: REDASH_MAIL_SERVER
          value: "mail.example.net"
        - name: REDASH_MAIL_PORT
          value: "587"
        - name: REDASH_MAIL_PASSWORD
          value: "password"
        - name: REDASH_MAIL_DEFAULT_SENDER
          value: "redash@mail.example.net"
        - name: REDASH_LOG_LEVEL
          value: "INFO"
        - name: REDASH_DATABASE_URL
          value: "postgresql://redash:redash@127.0.0.1:5432/redash"
        - name: REDASH_COOKIE_SECRET
          value: "not-so-secret"
        - name: REDASH_ADDITIONAL_QUERY_RUNNERS
          value: "redash.query_runner.python"
')

divert(0)
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redash
  labels:
    app: redash
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redash
  strategy:
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: redash
    spec:
      containers:
      - name: redis
        image: redis
        ports:
        - name: redis
          containerPort: 6379
      - name: postgres
        image: postgres:11
        env:
        - name: POSTGRES_USER
          value: redash
        - name: POSTGRES_PASSWORD
          value: redash
        - name: POSTGRES_DB
          value: redash
        ports:
        - name: postgres
          containerPort: 5432
      - name: server
        image: redash/redash
        args: [ "server" ]
        env:
        - name: REDASH_WEB_WORKERS
          value: "2"
        redash_environment
        ports:
        - name: redash
          containerPort: 5000
      - name: scheduler
        image: redash/redash
        args: [ "scheduler" ]
        env:
        - name: QUEUES
          value: "celery"
        - name: WORKERS_COUNT
          value: "1"
        redash_environment
      - name: scheduled-worker
        image: redash/redash
        args: [ "worker" ]
        env:
        - name: QUEUES
          value: "scheduled_queries,schemas"
        - name: WORKERS_COUNT
          value: "1"
        redash_environment
      - name: adhoc-worker
        image: redash/redash
        args: [ "worker" ]
        env:
        - name: QUEUES
          value: "queries"
        - name: WORKERS_COUNT
          value: "1"
        redash_environment
---
apiVersion: v1
kind: Service
metadata:
  name: redash-nodeport
spec:
  type: NodePort
  selector:
    app: redash
  ports:
  - port: 5000
    targetPort: 5000

You can grab redash-deployment.m4 from Pastebin. What we did above was define the macro redash_environment (taking care to keep the indentation right) and use it in the container definitions in the Pod instead of copy-pasting that bunch of lines four times. Yes, you could have done it with any other template processor too.

You’re almost done. The Postgres database is still empty, so you need to open a shell in the server container and create the tables:

$ kubectl exec -it redash-f8556648b-tw949 -c server -- bash
redash@redash-f8556648b-tw949:/app$ ./manage.py database create_tables
:
redash@redash-f8556648b-tw949:/app$ exit
$
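The pod name above is specific to my cluster and yours will differ. As a sketch of a way around copying it by hand, kubectl exec also accepts a deploy/NAME reference and picks a pod of that Deployment itself. The helper name init_redash_db is made up for this example. I assume the Deployment is called redash as in the manifest, that kubectl points at the right cluster and namespace, and that manage.py lives in /app as the prompt in the session above suggests.

```shell
# Hypothetical helper (not part of Redash or kubectl): create the Redash tables
# without looking up the generated pod name first.
init_redash_db() {
  # Block until the Deployment named "redash" has finished rolling out.
  kubectl rollout status deployment/redash || return 1
  # deploy/redash makes kubectl choose a pod of the Deployment;
  # -c server selects the Redash server container inside that pod.
  kubectl exec deploy/redash -c server -- /app/manage.py database create_tables
}
```

Call init_redash_db once after the first kubectl apply. Postgres may need a few seconds after the pod starts before it accepts connections, so if the command fails right away, simply run it again.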

I used the above configuration to quickly launch Redash on a Windows machine that runs the Docker Desktop Kubernetes distribution. It is not meant for production: for example, no persistent storage is defined for Postgres, so the data is lost when the pod goes away. In a production installation it could very well be that Postgres lives outside the cluster, so there is no need for such a container. The same might hold true for the Redis container too.

What I wanted to demonstrate is that, in this specific circumstance, a 40+ year old tool can come to your assistance without you having to install yet another templating tool. It also shows one way to react when you need !!merge and your parser does not support it.