arfparse – a simple tool to extract ARF information

arfparse is a utility used to parse mailbox archives and extract ARF information, as described in RFC 5965, “An Extensible Format for Email Feedback Reports”.

It is meant to work as a preliminary processor, so the output of the program is kept as simple as possible. Example usage:

$ arfparse -m ~/mail/

This will extract ARF information sent from assuming the FBL reports are archived in ~/mail/
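For readers unfamiliar with the format: an ARF report carries a message/feedback-report part made of "Name: value" fields such as Feedback-Type, User-Agent and Version (RFC 5965). The C sketch below is not taken from arfparse; it only illustrates splitting one such field line, and the function name arf_split_field is made up for the illustration.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of arfparse: split one "Name: value"
 * line of a message/feedback-report part into its two halves.
 * Returns 0 on success, -1 if the line has no field name or a buffer
 * is too small. Folded (continuation) lines are not handled.
 */
int
arf_split_field(const char *line, char *name, size_t nlen,
    char *value, size_t vlen)
{
        const char *colon = strchr(line, ':');
        const char *v, *end;
        size_t n;

        if (colon == NULL || colon == line)
                return -1;

        n = (size_t)(colon - line);
        if (n >= nlen)
                return -1;
        memcpy(name, line, n);
        name[n] = '\0';

        /* skip blanks after the colon */
        v = colon + 1;
        while (*v == ' ' || *v == '\t')
                v++;

        /* trim trailing CR, LF and blanks */
        end = v + strlen(v);
        while (end > v && isspace((unsigned char)end[-1]))
                end--;

        n = (size_t)(end - v);
        if (n >= vlen)
                return -1;
        memcpy(value, v, n);
        value[n] = '\0';
        return 0;
}
```

Called on the line "Feedback-Type: abuse", it stores "Feedback-Type" in name and "abuse" in value.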

arfparse is developed on OpenBSD with Panda-IMAP and should work with UW-IMAP too. It is the product of structured procrastination.

You can grab arfparse from GitHub.

Feel free to send me flames, suggestions and improvements.

PS: Yes, I would post about arfparse in the comments section here, but comments seem to be locked for now.

c-client callbacks

* This is mostly for personal copy-paste reasons

Those who take the time to develop applications using UW-IMAP (or Panda IMAP) know that there are a number of callbacks that need to be defined. What follows is the simplest (do nothing) version of them.

#include "c-client.h"

void mm_flags(MAILSTREAM *stream,unsigned long number) {
}

void mm_status(MAILSTREAM *stream,char *mailbox,MAILSTATUS *status) {
}

void mm_searched(MAILSTREAM *stream,unsigned long number) {
}

void mm_exists(MAILSTREAM *stream,unsigned long number) {
}

void mm_expunged(MAILSTREAM *stream,unsigned long number) {
}

void mm_list(MAILSTREAM *stream,int delimiter,char *name,long attributes) {
}

void mm_lsub(MAILSTREAM *stream,int delimiter,char *name,long attributes) {
}

void mm_notify(MAILSTREAM *stream,char *string,long errflg) {
}

void mm_log(char *string,long errflg) {
}

void mm_dlog(char *string) {
}

void mm_login(NETMBX *mb,char *user,char *pwd,long trial) {
}

void mm_critical(MAILSTREAM *stream) {
}

void mm_nocritical(MAILSTREAM *stream) {
}

long mm_diskerror(MAILSTREAM *stream,long errcode,long serious) {
        return NIL;     /* the only callback that returns a value */
}

void mm_fatal(char *string) {
}


Since Mark Crispin left the UW, development on the UW-IMAP toolkit paused. Mark however continued developing the toolkit under the name Panda-IMAP.

Panda-IMAP is not publicly available. Mark Crispin grants access to people (and organizations) that donate to the development of the project. Since I am a dedicated user of the UW-IMAP toolkit, and time had passed since the last version of the “old” UW-IMAP toolkit (back in 2008), on April 23, 2010 I personally donated $100 to the project. Replacing UW-IMAP with Panda-IMAP was a piece of cake, and given that we are planning to move to mix format mailboxes, I am extremely happy with the result of using Panda-IMAP so far.

PS: A question for those defending Ms. Anna Diamantopoulou’s ill-judged tweet: does the above mean that the cost of the mail server is $100, or whatever other amount I decided to give out of my own pocket?

UW-IMAP utilities and restrictBox mode

Note to self:

When using the UW-IMAP toolkit in restrictBox or closedBox modes, or even with local patches, it is helpful to have a “vanilla” version of the utilities around, because the locally built ones may not work as expected. It took me a while to figure out why

mailutil prune `pwd`/Trash "before 18-may-2008"

was not working as expected. Our local version was linked with a c-client.a with restrictBox = -1 and a local version of getpwnam(3).

so long, and thanks for all the fish!

Two days ago Mark Crispin wrote in imap-protocol:

I was laid off today. Unfortunately, I didn’t get a change to push imap-2007b out the door in release status, but the development tarball there is pretty close to my final bits.

If you have support requests for UW imapd, please send them to the Alpine development team at UW, alpine-contact at

It has been a privilege to work with all of you for the past 20 years.

— Mark —
Science does not emerge from voting, party politics, or public debate.
Si vis pacem, para bellum.

Why on earth anyone would want to lay off Mark Crispin is a mystery to me. As a long-time user of the UW-IMAP toolkit I want to thank MRC for his work and software, which solved many of my problems and saved me a great deal of time.

More fun with message threading

When I try to write email-related code and the result fails my expectations, I use my plan B: Write it in c-client. I suppose the fact that I do not start with c-client from the beginning is a result of suffering from the Not Invented Here Syndrome.

The other day I was trying to decipher the semantics of Thread-Index: and Thread-Topic: since it seems that Microsoft has not placed any public information on them. Apostolos suggested that Thread-Index: takes BASE64 values, to which I replied negatively. After all, decoding


using perl -MMIME::Base64 -ne 'print decode_base64($_);' does not produce anything meaningful.

However I dug a little bit more, following this piece of advice from the imap-protocol list:

“Look at the evolution source code, it contains quite a bit of information on this.”

camel-exchange-folder.c from the Evolution Exchange package reveals the following gem:

/* A new post to a folder gets a 27-byte-long thread index. (The value
 * is apparently unique but meaningless.) Each reply to a post gets a
 * 32-byte-long thread index whose first 27 bytes are the same as the
 * parent's thread index. Each reply to any of those gets a
 * 37-byte-long thread index, etc. The Thread-Index header contains a
 * base64 representation of this value.
 */
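Taking the byte counts in the comment above at face value, the length of a decoded Thread-Index value tells how deep a message sits in its thread, and a reply's index starts with its parent's. A small C sketch of both checks follows. The function names are invented for this illustration, the root and step sizes are parameters because the comment is the only source used here for 27 and 5, and base64 decoding is assumed to have happened already.

```c
#include <stddef.h>
#include <string.h>

/*
 * Depth of a message in its thread from the length of its decoded
 * Thread-Index: 0 for a new post, 1 for a reply, and so on.
 * Returns -1 if the length does not fit root + k * step.
 */
long
thread_index_depth(size_t len, size_t root, size_t step)
{
        if (step == 0 || len < root || (len - root) % step != 0)
                return -1;
        return (long)((len - root) / step);
}

/*
 * Nonzero if child is a direct reply to parent: exactly one step
 * longer, with the parent's bytes as its prefix.
 */
int
thread_index_is_reply(const unsigned char *parent, size_t plen,
    const unsigned char *child, size_t clen, size_t step)
{
        if (clen != plen + step)
                return 0;
        return memcmp(parent, child, plen) == 0;
}
```

With the comment's numbers, thread_index_depth(37, 27, 5) yields 2, that is, a reply to a reply.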

[ Update: Message Threading in Public Folders ]

Enough with trying to work with Thread-Index: then! JWZ has documented a very effective algorithm for message threading and c-client implements it (read docs/draft/sort.txt from the source code distribution):

spg = mail_newsearchpgm();
thr = mail_thread(ds, "REFERENCES", NIL, spg, 0);
walk_thread(thr, NIL);

(You are advised to read docs/internal.txt.)

The “REFERENCES” argument to mail_thread() instructs it to use jwz’s algorithm. The other option is to use “ORDEREDSUBJECT” (or as draft-ietf-imapext-sort-19.txt calls it: “poor man’s threading”). walk_thread() just prints the edges of the graph (actually it is a tree):

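void
walk_thread(THREADNODE *thr, THREADNODE *prev)
{
        if (thr) {
                /* edge from the parent to this node (num is unsigned long) */
                if (prev) {
                        printf("%lu %lu\n", prev->num, thr->num);
                }

                /* descend to the first child, or mark a leaf */
                if (thr->next) {
                        walk_thread(thr->next, thr);
                } else {
                        printf("%lu NIL\n", thr->num);
                }

                /* siblings share the same parent */
                if (thr->branch) {
                        walk_thread(thr->branch, prev);
                }
        }
}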

You may wish to use the output of the above routine (slightly modified) and feed it to dot, so that you can have an image display of the threads in the email collection that you study.
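One way to produce input for dot, sketched in C. To keep the example independent of c-client it uses a stand-in struct tnode with only the three fields the routine above touches (num, next, branch); with the real library you would walk THREADNODE pointers instead. The names struct tnode, dot_edges and thread_to_dot are made up for the illustration.

```c
#include <stdio.h>

/* Stand-in for c-client's THREADNODE; only the fields used here. */
struct tnode {
        unsigned long num;      /* message number */
        struct tnode *next;     /* first child */
        struct tnode *branch;   /* next sibling */
};

/* Print one "parent -> child" edge per line for every node under thr. */
static void
dot_edges(FILE *fp, const struct tnode *thr, const struct tnode *prev)
{
        for (; thr != NULL; thr = thr->branch) {
                if (prev != NULL)
                        fprintf(fp, "  %lu -> %lu;\n", prev->num, thr->num);
                else
                        fprintf(fp, "  %lu;\n", thr->num);  /* thread root */
                dot_edges(fp, thr->next, thr);
        }
}

/* Wrap the edges in a digraph so that dot can read the result. */
void
thread_to_dot(FILE *fp, const struct tnode *root)
{
        fprintf(fp, "digraph threads {\n");
        dot_edges(fp, root, NULL);
        fprintf(fp, "}\n");
}
```

Calling thread_to_dot(stdout, root) and piping the program's output through dot -Tpng gives a single image of all the threads.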

What is left to discuss a little further is the THREADNODE structure. You can go from a THREADNODE to its first child via the next pointer (thr->next in the above example). If the THREADNODE has two children, then the second is a branch of the first (thr->next->branch). If it has three, the third is a branch of the second child (thr->next->branch->branch), and so on.
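The first-child and next-sibling layout described above can be exercised with a short C sketch. It uses a stand-in struct tnode that carries only the num, next and branch fields instead of the real THREADNODE, so that it compiles without c-client; count_children and nth_child are hypothetical helper names.

```c
#include <stddef.h>

/* Stand-in for c-client's THREADNODE; only the fields used here. */
struct tnode {
        unsigned long num;      /* message number */
        struct tnode *next;     /* first child */
        struct tnode *branch;   /* next sibling */
};

/* Number of direct children: follow next once, then branch repeatedly. */
size_t
count_children(const struct tnode *parent)
{
        const struct tnode *c;
        size_t n = 0;

        if (parent == NULL)
                return 0;
        for (c = parent->next; c != NULL; c = c->branch)
                n++;
        return n;
}

/* The i-th child (0-based), or NULL if there are fewer children. */
const struct tnode *
nth_child(const struct tnode *parent, size_t i)
{
        const struct tnode *c;

        if (parent == NULL)
                return NULL;
        for (c = parent->next; c != NULL && i > 0; c = c->branch)
                i--;
        return c;
}
```

Here nth_child(thr, 2) returns the same node as thr->next->branch->branch, without the risk of dereferencing a NULL pointer along the way.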

Migrating thousands of mailboxes to a new mailbox storage

Years ago we were faced with the following situation: we had thousands of mailboxes that were being served by a proprietary and unsupported version of email software. So we decided to move to a new architecture. While researching I selected a very good (IMHO) commercial product. But for reasons outside the scope of this post I opted for a F/OSS solution, and specifically the UW-IMAP toolkit. For those rushing to judge that price was the decisive factor, I only have to reiterate Vladimir Butenko’s words:

“Bottom line: you always pay. You need a simple thing – you pay a small amount, you need a big thing – you pay more.” (plain message and comp.mail.imap thread)

OK and now for the real question: How do you move thousands of mailboxes without your users noticing?

Since you do not know (and do not want to know) your users’ passwords, the only thing you can do is hack the source code of your POP3 server (we do not offer IMAP yet). Whether it is the UW-IMAP toolkit, Cucipop, popa3d or any other server whose source you have access to, the server sees the correct password when a user authenticates. So when the authentication succeeds you fork(2) a program that logs into the old server and fetches the mailbox to the new one, which runs the F/OSS software that you have selected. This can be fetchmail or any similar program. However, I have found that Net::POP3 is a better choice:

use Sys::Syslog;
use Net::POP3;
use DB_File;

# arguments: old server, user name, password
$host = shift or die;
$user = shift or die;
$pass = shift or die;

# users that have already been migrated are recorded here
tie %d, 'DB_File', "/etc/pop3.users.db", O_CREAT|O_RDWR, 0640, $DB_BTREE or die;
if ($d{$user}) {
        untie %d;
        exit 0;
}

$pop = Net::POP3->new($host) or die;
# login() returns undef on failure and "0E0" for an empty mailbox,
# so test for definedness rather than comparing with 0
if (defined $pop->login($user, $pass)) {
        openlog("pop3cat-tmail", 'pid', 'mail') or die;
        syslog('info', 'fetching mail for user %s', $user);
        $msgnums = $pop->list or die;
        # deliver in message-number order
        foreach $msgnum (sort { $a <=> $b } keys %$msgnums) {
                $msg = $pop->get($msgnum);
                open T, "| /usr/bin/tmail $user" or die;
                print T @$msg;
                close T;
        }
        $d{$user} = "OK";
        untie %d;
}
exit 0;

Where exactly in the server code you fork(2), exec(3) and wait(2) for the script depends on the source code. You need to find where in the server code the authentication procedure succeeds and patch from there, before the server actually opens the user’s mailbox.
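A minimal sketch of the fork(2), exec(3) and wait(2) step, assuming a POSIX system. The function name run_helper is hypothetical, and where it gets called depends on your server's source, as said above. Waiting for the child matters here: the server must not open the mailbox before the migration has finished.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run a helper program and wait for it to finish.
 * Returns the helper's exit status (127 if it could not be executed),
 * or -1 if fork/waitpid failed or the helper was killed by a signal.
 */
int
run_helper(const char *path, char *const argv[])
{
        pid_t pid;
        int status;

        pid = fork();
        if (pid == -1)
                return -1;
        if (pid == 0) {
                /* child: replace ourselves with the helper */
                execv(path, argv);
                _exit(127);     /* only reached if execv() failed */
        }
        /* parent: block until the helper is done */
        if (waitpid(pid, &status, 0) == -1)
                return -1;
        return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In the patched server, argv would hold the script's name followed by the old host, the user name and the password, in the order the script reads them with shift, and the server would go on to open the mailbox only when run_helper returns 0.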

This script first checks whether it has already moved the user’s mailbox from the old server. If it has, the user is found in /etc/pop3.users.db and the script exits. If not, the mailbox is moved and the user is inserted in /etc/pop3.users.db. Net::POP3 on its own would only let you append the old messages to a traditional Unix format mailbox. That is why we fork tmail: I have chosen the mbx mailbox format to store messages (I compile UW-IMAP with CREATEPROTO=mbxproto), and tmail delivers in that format. If your server supports a format like Maildir, you have to customize the above script accordingly, since it is written for mbx, where the user’s mailbox is a single file.