I like milter-ahead a lot. But it is not the best fit for our particular deployment, because it assumes that all the useful information for deciding whether to accept or reject email resides not on the server it runs on, but on the servers that it queries. This is not milter-ahead’s fault. Milters have no way of expanding aliases while checking the recipient address, so the programmer has to resort to tricks like parsing the output of sendmail -bv user@address, which means running a second sendmail process for the same delivery. The alternative would be to hack milter-ahead to check the alias database for the existence of recipient addresses, but doing so the way sendmail reads the alias database is overly complex. One could also write an external daemon to monitor the alias database and inject entries into the (Berkeley DB) database maintained by milter-ahead, but that database is locked exclusively. And yes, exceptions could be entered in the access database, but that would mean maintaining two files for a single (and not so frequent) change in the alias file.
As I’ve blogged before, one of the reasons that I like MIMEDefang is that it gives the Postmaster a full programming language to filter stuff. By simply using md_check_against_smtp_server(), a poor man’s non-caching version of milter-ahead is possible. Adding support for reading the alias database (be it the text file or the hash table) is also trivial.
But what about the case of busy mail systems? You do not want to hammer your mail servers all the time with queries for which the answer is going to be constant for long periods of time. You need a caching mechanism. At first I thought of implementing such a mechanism the way milter-ahead does: by using a Berkeley DB database and some expiration mechanism, either from within MIMEDefang (retrieve the key and, if it should have expired by now, delete it; otherwise proceed as expected) or by an external “garbage collecting” daemon. But an interface with a clean way to enter keys and values, and to expire them, already exists and performs well: memcached. So I used Cache::Memcached within mimedefang-filter, and mimicking basic milter-ahead behavior (with caching) was done.
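To make the caching idea concrete, here is a minimal sketch of a filter_recipient that asks memcached first and only probes the backend on a miss. It is an illustration under assumptions rather than the filter as deployed: the backend host name, the memcached address, the one-hour lifetime, the rcpt: key prefix and the cached_verdict helper are all made up for the example.

```perl
use strict;
use warnings;

# Hypothetical settings: adjust to your own site.
my $BACKEND = 'mailstore.example.com';   # server that knows the real recipients
my $TTL     = 3600;                      # seconds a verdict stays cached
my $memd;                                # Cache::Memcached handle, created lazily

# Return the cached verdict for $key, or call $probe->() and cache its answer.
# $cache is any object with memcached-style get/set methods.
sub cached_verdict {
    my ($cache, $key, $ttl, $probe) = @_;

    my $hit = $cache->get($key);
    return @$hit if ref $hit eq 'ARRAY';

    my ($verdict, $text) = $probe->();
    # A temporary failure says nothing about the address, so do not cache it.
    $cache->set($key, [$verdict, $text], $ttl) unless $verdict eq 'TEMPFAIL';
    return ($verdict, $text);
}

# MIMEDefang calls this for every RCPT TO when recipient checks are enabled.
sub filter_recipient {
    my ($recipient, $sender, $ip, $hostname, $first, $helo) = @_;

    unless ($memd) {
        require Cache::Memcached;
        $memd = Cache::Memcached->new({ servers => ['127.0.0.1:11211'] });
    }

    # memcached keys must not contain whitespace or control characters.
    (my $key = lc $recipient) =~ s/[<>\s[:cntrl:]]//g;

    return cached_verdict($memd, "rcpt:$key", $TTL, sub {
        # Only the first two elements (verdict and message) are used here.
        md_check_against_smtp_server($sender, $recipient, $helo, $BACKEND, 25);
    });
}
```

Passing the cache object into cached_verdict keeps the caching logic independent of memcached itself, so it can be exercised with any object that offers get and set. Rejections are cached for as long as acceptances here; a shorter lifetime for rejections may suit sites where new accounts appear often.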
But what about the local aliases in the mail server? After all, they were the reason for all the fuss that prompted the switch. I wrote a Perl script that opens the alias database using the BerkeleyDB package. Two details need caution here:
- The first one is ignoring the invalid @:@ entry in the alias database. You do not see it in the alias text file, but you will see it when you run praliases. Sendmail uses this entry in order to know whether the database is up-to-date or not. See the bat book for a longer discussion of this.
- The second detail is that since the alias database is written by a C program, all strings are NUL-terminated. This is not the case with strings that are used as keys and values with Perl and the BerkeleyDB package. However, the Perl BerkeleyDB package provides filters to deal with this case. You need something like:
$db->filter_fetch_key( sub { s/\0$// } );
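Putting both details together, a sketch of reading the alias database could look like the following. It assumes a hash-type alias database, the kind sendmail builds by default, and it strips the trailing NUL from values as well as keys. The load_aliases and alias_for helpers are hypothetical names, and the example only looks at the local part of the address, so deciding whether the domain is local is left out.

```perl
use strict;
use warnings;

# Read the whole alias database into a hash of alias => expansion.
sub load_aliases {
    my ($path) = @_;
    require BerkeleyDB;

    my $db = BerkeleyDB::Hash->new(
        -Filename => $path,
        -Flags    => BerkeleyDB::DB_RDONLY(),
    ) or die "cannot open $path: $!";

    # Strip the trailing NUL that sendmail stores with every key and value.
    $db->filter_fetch_key(   sub { s/\0\z// } );
    $db->filter_fetch_value( sub { s/\0\z// } );

    my %aliases;
    my ($k, $v) = ('', '');
    my $cursor = $db->db_cursor();
    while ($cursor->c_get($k, $v, BerkeleyDB::DB_NEXT()) == 0) {
        next if $k eq '@';          # sendmail's own "@:@" marker, not an alias
        $aliases{lc $k} = $v;
    }
    return \%aliases;
}

# Return the expansion for the local part of $recipient, or nothing at all.
sub alias_for {
    my ($aliases, $recipient) = @_;
    (my $addr = lc $recipient) =~ s/^<|>$//g;
    my ($local) = split /\@/, $addr, 2;
    return unless defined $local && length $local;
    return $aliases->{$local};
}
```

Copying the entries into a plain hash means the lookup never has to append a NUL to the key. Querying the database directly would need a matching filter_store_key that adds it back.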
And then there’s the issue of making such a script a daemon. One can go the traditional way, use a daemonizer on steroids or simply use Proc::Daemon::Init and be done with it.
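For the Proc::Daemon route, a minimal sketch might look like this. The polling loop that watches the file's modification time is an assumption about what such a daemon would do, and refresh_if_changed, run_daemon and the $refresh callback are hypothetical names made up for the example.

```perl
use strict;
use warnings;

# Call $refresh->($path) when the file's mtime differs from $last_mtime.
# Returns the mtime to remember for the next round.
sub refresh_if_changed {
    my ($path, $last_mtime, $refresh) = @_;

    my $mtime = (stat $path)[9];
    return $last_mtime unless defined $mtime;    # file missing: keep old state
    return $last_mtime if defined $last_mtime && $mtime == $last_mtime;

    $refresh->($path);
    return $mtime;
}

# Detach from the terminal and poll the file forever.
sub run_daemon {
    my ($path, $interval, $refresh) = @_;

    require Proc::Daemon;
    # Forks, detaches and changes directory to /, so $path must be absolute.
    Proc::Daemon::Init();

    my $last;
    while (1) {
        $last = refresh_if_changed($path, $last, $refresh);
        sleep $interval;
    }
}
```

A call such as run_daemon('/etc/mail/aliases.db', 60, \&reload) would then invoke a reload routine of your own at most once a minute, and only after the alias database has been rebuilt. Both the path and the interval are examples.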
memcached comes in handy for storing key-value pairs in many system administration tasks, and I think I’m going to use it a lot more in mail filtering stuff.