OpenVPN, LDAP and group membership

While the need to integrate OpenVPN with LDAP seems straightforward, the documentation for the auth-ldap plugin is not easy to locate. Take for example the following auth-ldap.conf configuration file:

<LDAP>
    URL ldap://ldap.example.com
    Timeout 15
</LDAP>
<Authorization>
    BaseDN "ou=users,dc=example,dc=com"
    SearchFilter "(uid=%u)" # (or choose your own LDAP filter for users)
    RequireGroup false
</Authorization>

This is a very handy starter that would allow any user with a working password under the ou=users part of your tree to be granted access. But what if you want to restrict access based on group membership? According to fragments of documentation scattered across various forums and StackOverflow / ServerFault answers, you'd need to set RequireGroup true and then use the BaseDN of the group and the memberUid attribute within a <Group> … </Group> subsection of Authorization (sketched below, after the working configuration). This never worked for me. What worked was changing the SearchFilter to include group membership:

<LDAP>
    URL ldaps://ldap.example.com
    Timeout 15
</LDAP>
<Authorization>
    BaseDN "ou=users,dc=example,dc=com"
    SearchFilter "(&(uid=%u)(memberOf=cn=openvpn,ou=groups,dc=example,dc=com))"
    RequireGroup false
</Authorization>

Voila!
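For reference, the documented approach that never worked for me looks roughly like this (a sketch assembled from those scattered fragments; the group DN and the memberUid attribute here are illustrative):

<Authorization>
    BaseDN "ou=users,dc=example,dc=com"
    SearchFilter "(uid=%u)"
    RequireGroup true
    <Group>
        BaseDN "ou=groups,dc=example,dc=com"
        SearchFilter "(cn=openvpn)"
        MemberAttribute memberUid
    </Group>
</Authorization>

One caveat on the memberOf trick: the attribute must actually exist on the user entries. It is there by default in Active Directory, but plain OpenLDAP needs the memberof overlay enabled for it.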

I did not come up with this. I found it via random Googling somewhere on StackOverflow (I can no longer find that answer to cite it).

X-Google-Smtp-Source

After some years (this blog was known as blog.postmaster.gr in the past, after all) I had to do some digging into an email's headers. I couldn't find what I was looking for, but there was a header inserted by the Google Workspace system named X-Google-Smtp-Source. It had a base64 encoded value. Decoding it did not produce any immediately meaningful result, so I dug a bit more. Searching for this header brings up pages that repeat the following information from Google:

X-Google-Smtp-Source: This header shows the IP address of the computer that sent the email.

https://support.google.com/mail/thread/230509179/need-to-know-email-sent-recieved-times?hl=en

This does not provide much information. I tried to look for more, and even asked ChatGPT and Bard but they too came up with nothing helpful. Until I reached the following mail thread:

These headers are typically base64 encoded encrypted serialized protocol buffers. Without the key, you won’t get anything out of them, and the keys rotate on a schedule. For what’s actually in them, it’s probably overkill, but better safe than sorry.

They also don’t contain the IP address.

[…]

As Grant pointed out, we consider the IP address of the user to be PII and do not share it in most cases.

https://www.mail-archive.com/mailop@mailop.org/msg08419.html

Notice also the contradiction with Google's documentation: per the thread, the header does not carry the originating IP address at all. It seems that at some point in time it did carry the source IP, but something (legislation, maybe?) changed and the contents of the header changed too, without the documentation being helpfully updated.
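For what it's worth, "decoding" above meant nothing fancier than the following sketch (assuming the value is standard base64; pass the header value as the program's first argument):

package main

import (
    "encoding/base64"
    "fmt"
    "os"
)

func main() {
    if len(os.Args) < 2 {
        fmt.Fprintln(os.Stderr, "usage: decode <header-value>")
        os.Exit(1)
    }
    raw, err := base64.StdEncoding.DecodeString(os.Args[1])
    if err != nil {
        fmt.Fprintln(os.Stderr, "not valid base64:", err)
        os.Exit(1)
    }
    // Hex-dump the raw bytes; for this header the result is opaque.
    fmt.Printf("% x\n", raw)
}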

deletePad: delete forgotten etherpad pads

Etherpad is a fantastic tool for collaboration and text sharing among colleagues. Its Docker container can be easily deployed in a Kubernetes cluster, and you can tie it to a database for some persistence if you like.

However, this persistence can sometimes prove problematic, as you may accidentally share stuff that you shouldn't, or share stuff for longer than you should. For this reason, it is fairly easy to use the Etherpad API and implement a job that deletes old pads that have been left unedited for some time. And this is what I did:
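What follows is a minimal sketch of that job in Go, not my exact script: it assumes the classic APIKEY.txt authentication, an illustrative Etherpad URL, and a 90-day cutoff. It lists all pads, checks when each one was last edited, and deletes the stale ones, using the standard listAllPads, getLastEdited and deletePad API calls:

package main

import (
    "encoding/json"
    "fmt"
    "net/http"
    "net/url"
    "time"
)

// Illustrative values; adjust to your deployment.
const (
    base   = "http://etherpad:9001/api/1.2.13"
    apikey = "CHANGEME" // the contents of APIKEY.txt
    maxAge = 90 * 24 * time.Hour
)

// response is the common envelope of Etherpad API answers.
type response struct {
    Code    int             `json:"code"`
    Message string          `json:"message"`
    Data    json.RawMessage `json:"data"`
}

// call invokes a single Etherpad API method and returns its data field.
func call(method string, params url.Values) (json.RawMessage, error) {
    params.Set("apikey", apikey)
    resp, err := http.Get(base + "/" + method + "?" + params.Encode())
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()
    var r response
    if err := json.NewDecoder(resp.Body).Decode(&r); err != nil {
        return nil, err
    }
    if r.Code != 0 {
        return nil, fmt.Errorf("%s: %s", method, r.Message)
    }
    return r.Data, nil
}

func main() {
    data, err := call("listAllPads", url.Values{})
    if err != nil {
        panic(err)
    }
    var pads struct {
        PadIDs []string `json:"padIDs"`
    }
    json.Unmarshal(data, &pads)

    for _, id := range pads.PadIDs {
        d, err := call("getLastEdited", url.Values{"padID": {id}})
        if err != nil {
            continue
        }
        var le struct {
            LastEdited int64 `json:"lastEdited"` // milliseconds since the epoch
        }
        json.Unmarshal(d, &le)
        if time.Since(time.UnixMilli(le.LastEdited)) > maxAge {
            fmt.Println("deleting", id)
            call("deletePad", url.Values{"padID": {id}})
        }
    }
}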

Yes, there are Etherpad plugins that do the same, but I did not want to mess around with my deployment and add one more ConfigMap to support the settings for Etherpad and the like.


Originally, and because I link Etherpad to a Postgres database, I was expiring old pads using pg_cron, but this was not the cleanest of solutions: until Etherpad was restarted, a deleted pad remained in the process's memory and was still being served, even though it was no longer in the database. And that is why I resorted to using the API.

microk8s, nginx ingress and X-Forwarded-For

Sometimes when you run microk8s in AWS, you may want to have an application load balancer in front of it. Such configurations mess with the value of the X-Forwarded-For header regardless of whether the append attribute is present on the load balancer. According to the nginx ingress documentation, you need to edit the ConfigMap resource and add proxy-real-ip-cidr and use-forwarded-headers. You may also want to set compute-full-forwarded-for.

It only remains to figure out the name of that ConfigMap when the ingress is installed with microk8s enable ingress. It is named nginx-load-balancer-microk8s-conf:

apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-load-balancer-microk8s-conf
  namespace: ingress
data:
  proxy-real-ip-cidr: 172.31.0.0/16
  use-forwarded-headers: "true"
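If you prefer patching over editing the ConfigMap by hand, something along these lines does the same (the CIDR is, of course, that of your own VPC):

microk8s kubectl -n ingress patch configmap nginx-load-balancer-microk8s-conf \
  --type merge -p '{"data":{"proxy-real-ip-cidr":"172.31.0.0/16","use-forwarded-headers":"true"}}'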

Can I have a serviceless Ingress?

I wanted to implement a 302 redirect the other day. Normally you run nginx (or another web server) where you set up the server configuration with a directive like:

server {
    listen 0.0.0.0;
    server_name _;
    root /app;
    index index.htm index.html;
    return 301 $scheme://new_web_server_here$request_uri;
}

So I set about doing that, but then thought that it would mean running an extra pod without much need to do so. Would it be possible to do it via the Ingress controller directly? Yes, it is possible: if you do not specify a backend in the nginx ingress controller, it falls back to the default backend, and you can affect the ingress behavior with a configuration snippet:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: not-nginx
  annotations:
    nginx.ingress.kubernetes.io/server-snippet: |
      return 302 https://new_web_server_here$request_uri;
spec:
  tls:
  - hosts:
    - old_name_here
    secretName: secret_name_here
  rules:
  - host: old_name_here

(Do not copy-paste, check indentation as sometimes WordPress mangles it)

Depending on your setup, you may need to run a separate nginx instance after all, as snippets can create a security issue in a cluster where many users can configure Ingress objects.
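One more thing to check against your controller version: newer ingress-nginx releases ship with snippet annotations disabled by default, so the server-snippet above may simply be ignored until you allow them in the controller's ConfigMap:

data:
  allow-snippet-annotations: "true"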

I’d rather deploy on a Sunday

Originally posted on LinkedIn, but also saved here in hopes of longer posterity:

When you deploy you change the behavior of the system. You wouldn’t be deploying if you didn’t want to change it in the first place.

I don't deploy on Fridays, not because I'm afraid of the technical implications of a deployment gone wrong (we have known for years how to manage these), but because of the business implications and the outside system dependencies that operate in degraded mode during the weekend. It is not whether the Ops or Dev on call can handle things. It is whether the other side can, and the faith you have that they can. Or whether you (or they) have a business person on call.

I’d rather deploy on a Sunday

postgrest: Column of relation does not exist

If you are using postgrest and you are getting an error of the form:

Column 'column_name' of relation 'table_name' does not exist

Restart postgrest. You will notice that it says:

Schema cache loaded

This means that if you changed the table definition after postgrest started, it will not be able to write to it until it is restarted and has re-read the table definition.
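Depending on your postgrest version, a full restart may not even be necessary: the schema cache can also be reloaded at runtime, either by sending SIGUSR1 to the process or with a NOTIFY on the channel postgrest listens on (pgrst by default):

-- run this on the database postgrest is connected to
NOTIFY pgrst, 'reload schema';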

I found out about it while trying to send Graylog alerts to a Postgres database using HTTP Alert Notifications.

You can do even more interesting stuff if instead of postgrest you use logstash.

TIL: net/http/httptrace

I write a few lines of Go here and there every few months, so I am basically a hobbyist (even though serious work with Kubernetes demands more of it). Today a friend had a very valid question:

How can I know the source port of my HTTP request when using net/http?

Pavlos

This is a valid question. If I have a socket, I should know all four aspects of it (src IP, src port, dst IP, dst port), and the protocol of course. But it turns out this is not quite possible with net/http alone, and another friend suggested writing a custom transport to gain control over such unexported information.

I had a flash and thought: “It can’t be that no one has written something that traces an HTTP connection in Go!”. It turns out I was right: net/http/httptrace is right there, and you can get the needed information thanks to the net.Conn interface (pardon the missing error handling):

package main

import (
    "fmt"
    "io"
    "net/http"
    "net/http/httptrace"
)

func main() {
    client := http.Client{}
    req, _ := http.NewRequest("GET", "https://jsonip.com", nil)

    // GotConn fires once a connection has been obtained for the request;
    // GotConnInfo exposes the underlying net.Conn, i.e. both ends of the socket.
    trace := &httptrace.ClientTrace{
        GotConn: func(connInfo httptrace.GotConnInfo) {
            fmt.Printf("GotConn: %v %v\n", connInfo.Conn.LocalAddr(), connInfo.Conn.RemoteAddr())
        },
    }

    // Attach the trace to the request through its context.
    req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))
    res, _ := client.Do(req)
    defer res.Body.Close()

    resBody, _ := io.ReadAll(res.Body)
    fmt.Printf("%s\n", resBody)
}
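For the record, httptrace.ClientTrace has hooks for most of the connection's life cycle (DNSStart, ConnectDone, TLSHandshakeDone and friends), should you ever need more than the source port.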

I get that this can be written in a better manner, but for now I am happy that I learned something new.

I’ve stopped using docker-compose

docker-compose is a very handy tool when you want to run multi-container installations. Using a very simple YAML description, you can develop stuff locally and then push upstream whenever you feel something needs to enter the CI/CD cycle.

Sometimes, I've even used it in production, when I wanted to coordinate certain containers on a single VM. But lately I've stopped doing so. The reason is simple:

All that we do is ultimately going to be deployed in a Kubernetes cluster somewhere.

Given the above, there's no need to maintain two sets of YAMLs, one for docker-compose.yaml and one for the Kubernetes / Helm manifests. Just go with Kubernetes from the beginning. Run a cluster on your local machine (Docker Desktop, microk8s, or other) and continue from there. Otherwise you risk running into the variation of works on my machine that is phrased like but it works with docker-compose. Well, there's no docker-compose in production, so why should there be one on your machine? Plus, you'll get a sense of how things look in production.

If you're very used to working with docker-compose, you can start a very crude transition by assuming that you have a single Deployment and that every container you were going to deploy is a sidecar container in a single Pod (see the sketch below). After all, just like a Pod, any docker-compose execution cannot escape a single machine (yes, I know about Swarm). Then you can break it down into different Deployments per container you want to run.
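To make this concrete, here is what that crude first mapping could look like for a made-up compose file with an app and a redis (names and images are illustrative):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: app
        image: myapp:latest
      - name: redis    # its own service in docker-compose, a sidecar here
        image: redis:7

One wrinkle: containers in the same Pod reach each other over localhost, not over the compose service names.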

The above occurred to me when I was trying to deploy some software locally, before deploying on Kubernetes, and tried to follow the vendor instructions for docker-compose. They failed, and I lost quite some time trying to fix the provided YAML before it dawned on me: I do not need it. I need to test in Kubernetes anyway.

So there, stop using docker-compose when you can. Everyone will be happier.