Greenspun’s Tenth Rule and variations

For those who have not heard Greenspun’s Tenth Rule, it states that:

Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp.

By the way, Greenspun‘s rules 1 to 9 do not exist.

Seven months ago, during a discussion about Prolog, I asked Ozan S. Yigit to reformulate Greenspun’s tenth rule for Prolog. Oz replied:

Any sufficiently complicated modern program contains a buggy, informal implementation of prolog that casual observers confuse with lisp.

Just hours earlier I was basically a listener in a discussion that involved NoSQL. While clearly I am not a NoSQL advocate, I am no hater either, but what I heard lead me to the following reformulation of Greenspun’s rule, this time involving the relational model:

Those who blindly adopt #NoSQL will discover a variation of Greenspun’s tenth rule

I am sure that many other variations exist. In fact the Wikipedia page on Greenspun’s Tenth Rule contains a Prolog variation similar to Ozan’s and an Erlang version. So if you know of (or can make up) any other, please post it here (or somewhere).

An alternative to FEATURE(mailertable)

Using FEATURE(mailertable) one can instruct sendmail to route email for certain destination via a specific relay. A mailertable is essentially a static map that instructs sendmail where to route email for certain destinations ignoring DNS MX RRs (or other information). Example:

yahoo.com   smtp:[server.example.com]
yahoo.com.hk   smtp:[server.example.com]
yahoo.com.mx   smtp:[server.example.com]
yahoo.com.br   smtp:[server.example.com]
yahoo.com.cn   smtp:[server.example.com]
yahoo.com.sg   smtp:[server.example.com]

Why would one want to do that? Your customers may have been hit by a botnet and as a result your outgoing mail server may have sent enormous amount of spam. Since most high-profile mail hubs use some kind of reputation scheme on the IP addresses that contact them, it is quite probable that your outgoing mail server is experiencing delays, or worse denied delivery despite the fact that in the meantime you have done your best to stop the botnet and clear your queues. I know for it has happened to me.

A mailertable is a quick solution to route email through another mail server just for recipient domains that implement such policies. But it is far from perfect for the Postmaster has no way to know all the domains that Yahoo! Mail in the above example hosts in order to construct a mailer table. Luckily, when high-profile mail hubs (like Gmail, Yahoo! Mail and Hotmail) implement good patterns on their DNS MX RRs, a programmatic (instead of a static) solution can be deployed:

LOCAL_CONFIG
Kbestmx bestmx -T.TMP

LOCAL_RULE_0
R$+ < @ $+ > $*         $: $(bestmx $2 $: NOTFOUND $) $| $1 < @ $2 > $3
R$+.hotmail.com. $| $+ < @ $+ > $*      $#esmtp $@ [server.example.com] $: $2 < @ $3 > $4
R$+ $| $+ < @ $+ > $*   $: $2 < @ $3 > $4

In the above snippet, any email that is directed to a domain that is served by Hotmail’s servers is routed via server.example.com. For the record, our outgoing webmail server achieved a senderscore of 50, and although a filter stopped the plaque, Hotmail silently discarded email originating from it. Using the above solution restored communications for our users.

Using bestmx for discarding outgoing email

The following ruleset discards email that originates from domains for which we are not best MX. It is meant to be applied on outgoing email servers:

LOCAL_CONFIG
Kbestmx bestmx -T.TMP

LOCAL_RULESETS
SLocal_check_mail
R$*                               $: $>canonify $1
# You may (or may not) want to comment the following line
R < @ >                           $#OK
R$* < @ $+. > $*          $1 < @ $2 > $3
R$* < @ $+ > $*                   $: $2
# Short circuit certain domains (and host names)
Rexample.com                           $#OK
R$* . example.com                      $#OK
R$*                               $: $(bestmx $1 $: NO $)
# If a temporary error occurs, do not block
R$*.TMP                           $#OK
Rserver.example.com.          $#OK
R$*                               $#discard $: $1

This works for as long as spammers do not use domains for which they do not control the DNS zones. If they do control the DNS zones they can easily add your relays as MX to them. In such cases the above ruleset must be modified to lookup the name servers for domains that server.example.com is best MX and then decide to discard. However the above trick erased thousands of outgoing spams yesterday.

PS: Like I posted on twitter: I rewrote the above filter in ~35 lines of Perl (subroutine filter_sender for MIMEDefang’s mimedefang-filter). The sendmail version is both more compact and readable (at least to me).

Panda-IMAP

Since Mark Crispin left the UW, development on the UW-IMAP toolkit paused. Mark however continued developing the toolkit under the name Panda-IMAP.

Panda-IMAP is not publicly available. Mark Crispin allows access to it to people (and organizations) that donate to the development of the project. Since I am a dedicated user of the UW-IMAP toolkit and time had passed since the last version of the “old” UW-IMAP toolkit (back in 2008) on April 23, 2010 I personally donated $100 to the project. Replacing UW-IMAP with Panda-IMAP was a piece of cake and given that we are planning to move to mix format mailboxes, I am extremely happy with the result of using Panda-IMAP so far.

ΥΓ: Ερώτημα για όσους υπερασπίζονται το άστοχο tweet της κυρίας Άννας Διαμαντοπούλου: Τα παραπάνω σημαίνουν πως το κόστος του mail server είναι $100, ή όποιο άλλο ποσό αποφάσιζα να δώσω από την τσέπη μου;

on picking an MTA

Sometimes I get asked on what is my MTA (Mail Transfer Agent) of choice. Almost always I am asking “What do you want to do with it?”. Personally, in most places I install sendmail. There are cases (cases where one would use FEATURE(nullclient) or similar) where I install nullmailer, for I find it unnecessary to run sendmail.

People sometimes ask me why do I choose sendmail and not Postfix (or Qmail in the old days) or even Exim since we are running a mostly Debian shop. Leaving the monolithic argument aside (which is kind of funny when most people that use it are using a monolithic kernel OS anyway) I am using sendmail because of its expressive power. I can find a way to express what I am thinking (filtering, routing, etc) in its modem noise of a programming language or milters like MIMEDefang (IIRC, there’s a wonderful PDF presentation by Ricudis on the Turing completeness of the sendmail.cf language but I have no link to it).

It is not that I have not used other MTAs. Hell, I was even running Postfix alpha versions right after it was renamed from VMailer. And occasionally I am running MeTA1 instances. But I always return to sendmail. If it does not suit you, it is OK. Pick the one MTA that can help you build the setup that you have in mind, be it Exim, Postfix, netqmail, commercial software like Exchange or CommuniGate, whatever. If it works for you and your team, then it is the right choice. Endless debates are for people who have too much free time.

However, if there is one recommendation that I can share, this is it: If you are serious about email (routing) invest some time reading the bat book. He who can understand a complex piece of software like sendmail, can guide himself through any email system.

(triggered by a brief conversation I had with a friend this afternoon)

sendmail load configuration

This post is about a neat trick that I have not seen many times discussed. According to the configuration README the default values for controlling load averages are:

  • confQUEUE_LA (QueueLA) Load average at which queue-only function kicks in. Default values is (8 * numproc) where numproc is the number of processors online (if that can be determined).
  • confREFUSE_LA (RefuseLA) Load average at which incoming SMTP connections are refused. Default values is (12 * numproc) where numproc is the number of processors online (if that can be determined).

However in “Sendmail Theory and Practice” (I am a proud owner of both editions) Paul Vixie and Fred Avolio propose a different approach:

“Astute readers will note that the value shown for Ox (QueueLA) is larger than the value shown for OX (RefuseLA), and that this is opposite from the configuration files you may have seen elsewhere. Setting them as shown here gives Sendmail a range of load average in which it is capable of delivering messages from its queue but incapable of receiving new messages. This is intentional. If you set Ox to be less than OX, Sendemail has instead a range of load average in which it can receive new mail (thus adding to the queue) but cannot deliver any queued mail. We believe that mail queues should become smaller or stay the same size when the load average is high. After watching our large mail gateway computers melt down many times over the years, we have learned that it is better to let other hosts’ mail stay where it was -on other hosts- when our load average is high, than to accept it even though we don’t plan to do anything with it until load average becomes low again.”

In other words although the defaults suggest otherwise, it may be wiser to have QueueLA > RefuseLA. This piece of advice is on both the 1995 (1st) and 2002 (2nd) editions of the book. A pearl that comes from 1995 that is still relevant.