upgrade

There seems to be a shift in people’s perception of what a “computer person” does. While in the old days it would go like this:

– So you are into computers, right?
– Yes
– I have this problem with Word

These days it goes like this:

– So you are into computers, right?
– Yes
– There’s this guy on Facebook. Can we trace him so that …

Is this an upgrade or what?

Update: It seems that I am repeating myself yearly. Back then it was a neighbour. This time it was a friend of a friend. And in between I have been asked a number of times.

Explaining “the zone”

This is how I try to explain “the zone” sometimes:

– Suppose that a task takes 60 minutes to complete. Can you break it into six ten-minute intervals and complete it in six days?

60-minute tasks that require you to be “in the zone” cannot be completed in six or seven ten-minute intervals.

loop

for (i = 0; i < strlen(string); i++) {
  :
}

This is a C version of a loop that I recently bumped into while copy-pasting some code. In the above loop strlen(string) is evaluated on every test of the loop condition, that is strlen(string) + 1 times, instead of just once. Unless the length of string changes while in the loop, the call can be moved out of it:

int len;
:
len = strlen(string);
for (i = 0; i < len; i++) {
  :
}

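Unless the compiler detects this and hoists the call out of the loop, the first version is very bad practice: every test of the loop condition rescans the string, so a linear pass becomes quadratic work. Yes, many compilers can do this when they are able to prove that the string is not modified inside the loop, but this does not mean that the programmer should rely on the compiler's optimizations. What if it does not get optimized in the end?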

This is not a rant about wasted CPU cycles. It is mostly a rant about laziness in our thinking.
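As a minimal sketch of the two safe forms, assume the loop body only reads the string (here it counts spaces, a task picked purely for illustration). The function names count_spaces_hoisted and count_spaces_sentinel are hypothetical and do not come from the code mentioned above:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the length is computed once, before the loop. */
size_t count_spaces_hoisted(const char *s)
{
    size_t len = strlen(s);   /* single pass to find the length */
    size_t n = 0;

    for (size_t i = 0; i < len; i++) {
        if (s[i] == ' ')
            n++;
    }
    return n;
}

/* Hypothetical helper: no strlen at all, the loop stops at the NUL byte. */
size_t count_spaces_sentinel(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++) {
        if (*s == ' ')
            n++;
    }
    return n;
}
```

Both versions walk the string a constant number of times. The second one is often the more natural idiom in C when the index itself is not needed.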

The Slow Loop

“Information wants to be free”

Q11: Is it possible to know the algorithms used by Member States in the construction of their VAT identification numbers?

The European Commission cannot divulge these algorithms. However, the structure of VAT identification numbers is given in the table below.

As it happened, I was looking for a method of verifying Greek VAT identification numbers. All I knew was that it was a check digit algorithm. As the above quote from the EC/VIES site shows, the algorithms are somehow “secret”. But as the old saying goes, “Information wants to be free”. All it took was asking on twitter, and a bunch of links describing the verification method were sent to me. Among them the one by @stsimb stands out, as it points to open source code by GSIS (General Secretariat of Information Systems).

So please, Member States, publish your algorithms. Obscurity only delays the inevitable information release.
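For the curious, here is a minimal sketch of the check digit rule as it is commonly described for the nine-digit Greek number: the first eight digits are weighted by 256, 128, ..., 2, and the sum modulo 11, then modulo 10, must equal the ninth digit. This is an assumption based on publicly circulated descriptions, not something the EC text above confirms, and it is not a copy of the GSIS code. The name afm_is_valid is hypothetical:

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: checks a 9-digit string against the commonly
 * described Greek check digit rule (an assumption, see the text above).
 * Note that an all-zero input passes this arithmetic; a real validator
 * may want to reject it explicitly.
 */
bool afm_is_valid(const char *afm)
{
    if (afm == NULL || strlen(afm) != 9)
        return false;

    int sum = 0;
    for (int i = 0; i < 9; i++) {
        if (!isdigit((unsigned char)afm[i]))
            return false;
        if (i < 8)
            sum += (afm[i] - '0') << (8 - i);   /* weights 256, 128, ..., 2 */
    }

    /* The ninth digit must equal (sum mod 11) mod 10. */
    return (sum % 11) % 10 == afm[8] - '0';
}
```

Under this rule the made-up number 123456783 passes: the weighted sum is 1004, 1004 mod 11 is 3, and the last digit is 3. The same prefix with any other final digit fails.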

Greenspun’s Tenth Rule and variations

For those who have not heard Greenspun’s Tenth Rule, it states that:

Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp.

By the way, Greenspun’s rules 1 to 9 do not exist.

Seven months ago, during a discussion about Prolog, I asked Ozan S. Yigit to reformulate Greenspun’s tenth rule for Prolog. Oz replied:

Any sufficiently complicated modern program contains a buggy, informal implementation of prolog that casual observers confuse with lisp.

Just hours earlier I was basically a listener in a discussion that involved NoSQL. While I am clearly not a NoSQL advocate, I am no hater either, but what I heard led me to the following reformulation of Greenspun’s rule, this time involving the relational model:

Those who blindly adopt #NoSQL will discover a variation of Greenspun’s tenth rule

I am sure that many other variations exist. In fact the Wikipedia page on Greenspun’s Tenth Rule contains a Prolog variation similar to Ozan’s and an Erlang version. So if you know of (or can make up) any others, please post them here (or somewhere).

An alternative to FEATURE(mailertable)

Using FEATURE(mailertable) one can instruct sendmail to route email for certain destinations via a specific relay. A mailertable is essentially a static map that tells sendmail where to route email for those destinations, ignoring DNS MX RRs (or other information). Example:

yahoo.com   smtp:[server.example.com]
yahoo.com.hk   smtp:[server.example.com]
yahoo.com.mx   smtp:[server.example.com]
yahoo.com.br   smtp:[server.example.com]
yahoo.com.cn   smtp:[server.example.com]
yahoo.com.sg   smtp:[server.example.com]

Why would one want to do that? Your customers may have been hit by a botnet and as a result your outgoing mail server may have sent an enormous amount of spam. Since most high-profile mail hubs use some kind of reputation scheme on the IP addresses that contact them, it is quite probable that your outgoing mail server is experiencing delays or, worse, is being denied delivery, despite the fact that in the meantime you have done your best to stop the botnet and clear your queues. I know, for it has happened to me.

A mailertable is a quick way to route email through another mail server just for the recipient domains that implement such policies. But it is far from perfect, for the Postmaster has no way of knowing all the domains that Yahoo! Mail (in the above example) hosts in order to construct a mailertable. Luckily, when high-profile mail hubs (like Gmail, Yahoo! Mail and Hotmail) follow consistent patterns in their DNS MX RRs, a programmatic (instead of a static) solution can be deployed:

LOCAL_CONFIG
Kbestmx bestmx -T.TMP

LOCAL_RULE_0
R$+ < @ $+ > $*         $: $(bestmx $2 $: NOTFOUND $) $| $1 < @ $2 > $3
R$+.hotmail.com. $| $+ < @ $+ > $*      $#esmtp $@ [server.example.com] $: $2 < @ $3 > $4
R$+ $| $+ < @ $+ > $*   $: $2 < @ $3 > $4

In the above snippet, any email directed to a domain whose best MX is one of Hotmail’s servers is routed via server.example.com. For the record, our outgoing webmail server achieved a senderscore of 50, and although a filter stopped the plague, Hotmail silently discarded email originating from it. Using the above solution restored communications for our users.

New eBooks on Graph Theory

My twitter stream and my INBOX brought to my attention two new books on Graph Theory:

  • “Graph Theory and Complex Networks: An Introduction” by Maarten van Steen. It is very interesting to note that this book is also available electronically as a personalised PDF. As the author notes: “When you write a book containing mathematical symbols, thinking big and acting commercially doesn’t seem the right combination. I merely hope to see the material to be used by many students and instructors everywhere and to receive a lot of constructive feedback that will lead to improvements. Acting commercially has never been one of my strong points anyway”.
  • The other book is the fourth edition of Reinhard Diestel’s “Graph Theory”. This book is also available electronically in different formats. I bought the student edition for €12.50 (offer expires on Aug 15, 2010).

PS: On a side note, I decided to buy a BeBook Mini.

Using bestmx for discarding outgoing email

The following ruleset discards email that originates from domains for which we are not best MX. It is meant to be applied on outgoing email servers:

LOCAL_CONFIG
Kbestmx bestmx -T.TMP

LOCAL_RULESETS
SLocal_check_mail
R$*                               $: $>canonify $1
# You may (or may not) want to comment out the following line
R < @ >                           $#OK
R$* < @ $+. > $*          $1 < @ $2 > $3
R$* < @ $+ > $*                   $: $2
# Short circuit certain domains (and host names)
Rexample.com                           $#OK
R$* . example.com                      $#OK
R$*                               $: $(bestmx $1 $: NO $)
# If a temporary error occurs, do not block
R$*.TMP                           $#OK
Rserver.example.com.          $#OK
R$*                               $#discard $: $1

This works for as long as spammers use sender domains whose DNS zones they do not control. If they do control the DNS zones, they can easily add your relays as MX to them. In such cases the above ruleset must be modified to look up the name servers of the domains for which server.example.com is best MX and then decide whether to discard. However, the above trick erased thousands of outgoing spams yesterday.
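The decision the ruleset makes can be sketched outside sendmail as follows. This is a simplified illustration, not sendmail's implementation: it assumes the MX records of the sender domain have already been fetched into an array, it ignores the temporary-error and short-circuit cases, and struct mx_record, best_mx and should_discard are hypothetical names:

```c
#include <stdbool.h>
#include <stddef.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* Hypothetical, already-resolved MX record of the sender domain. */
struct mx_record {
    unsigned preference;   /* lower value means more preferred */
    const char *host;
};

/* Return the host with the lowest preference, or NULL for an empty list. */
const char *best_mx(const struct mx_record *mx, size_t n)
{
    const char *best = NULL;
    unsigned best_pref = 0;

    for (size_t i = 0; i < n; i++) {
        if (best == NULL || mx[i].preference < best_pref) {
            best = mx[i].host;
            best_pref = mx[i].preference;
        }
    }
    return best;
}

/* Discard unless our relay is the best MX of the sender domain. */
bool should_discard(const struct mx_record *mx, size_t n, const char *relay)
{
    const char *best = best_mx(mx, n);

    /* Host names are compared case-insensitively, as in DNS. */
    return best == NULL || strcasecmp(best, relay) != 0;
}
```

With records { 20, "backup.example.net." } and { 10, "server.example.com." }, the best MX is server.example.com., so mail from that domain is kept. Swap the two preferences and it would be discarded.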

PS: Like I posted on twitter: I rewrote the above filter in ~35 lines of Perl (subroutine filter_sender for MIMEDefang’s mimedefang-filter). The sendmail version is both more compact and readable (at least to me).