The Principles of Scientific Management

I was intrigued to read “The Principles of Scientific Management” after reading Gene Woolsey’s “Real World Operations Research” and Bob Emiliani’s “Lean Behaviors” [in PDF]. I read the eBook version from eBooks.com (only to find out later that it is also available online, at least here and here).

The book is old and it shows. The first part of the book, which focuses on the basic principles of scientific management, is highly interesting and sometimes makes one wonder why we are not taught such stuff:

Develop methods based on scientific study for each element of a man’s work, which will replace the old rule-of-thumb methods.
Scientifically select and then train, teach, and develop the workmen, whereas in the past they chose their own work and trained themselves as best they could.
Cooperate with the men so as to insure all of the work being done in accordance with the principles of the methods which have been developed.
There is an almost equal division of the work and the responsibility between the management and the workmen so that the managers apply scientific management principles to planning the work and the workers actually perform the tasks.

The second part though (examples of applications of scientific management by the author and his colleagues) is a little boring, since its domain (handling pig iron) is far from my interests. The Wikipedia page on Scientific Management includes heavy criticism of its application (which is not unfair). However, the author warns that scientific management is a process that takes a long time to install and that one should not try to implement it faster. Both Woolsey, in his collection of papers, and Emiliani note that many people have not fully understood the methods, and that this leads to the criticism. Emiliani in particular notes that managers’ need for short-term results undermines the whole set of ideas and leads to their misapplication.

All in all, it was not a waste of (bus) time, but if anyone is interested in such stuff, I would recommend they spend their time reading “Lean Behaviors”. It is more current, easier to read and to the point with regard to Taylor’s ideas (I always carry a printed copy of it in my bag).

upgrade

There seems to be a shift in people’s perception of what a “computer person” does. While in the old days it would go like this:

– So you are into computers, right?
– Yes
– I have this problem with Word …

These days it goes like this:

– So you are into computers, right?
– Yes
– There’s this guy in Facebook. Can we trace him so that …

Is this an upgrade or what?

Update: It seems that I am repeating myself yearly. Then it was a neighbour. This time a friend of a friend. And in between I have been asked a number of times.

Firefox and IPv6

So you’ve set up a tunnel with your favorite tunnelbroker (for example Tunnelbroker.net or SixXS) but Firefox still refuses to “see the light” and insists on an IPv4 web. You simply need to type about:config in the address bar and change network.dns.disableIPv6 to false.

Explaining “the zone”

This is how I try to explain “the zone” sometimes:

– Suppose that a task takes 60 minutes to complete. Can you break it into six ten-minute intervals and complete it in six days?

60-minute tasks that require you to be “in the zone” cannot be completed in six or seven ten-minute intervals.

loop

for (i = 0; i < strlen(string); i++) {
  :
}

This is a C version of a loop that I recently bumped into while copy-pasting some code. In the above loop strlen(string) is evaluated on every pass through the loop condition, that is strlen(string) + 1 times instead of just once. Unless the length of string changes while in the loop, it can be computed once up front:

int len;
:
len = strlen(string);
for (i = 0; i < len; i++) {
  :
}

Unless the compiler detects this and optimizes the loop, this is very bad practice. Yes, many compilers can hoist the call when they can prove that the string does not change inside the loop, but this does not mean that the programmer should rely on the compiler's optimizations. What if it does not get optimized in the end?

This is not a rant about wasted CPU cycles. It is mostly a rant about laziness in our thinking.


“Information wants to be free”

Q11: Is it possible to know the algorithms used by Member States in the construction of their VAT identification numbers?

The European Commission cannot divulge these algorithms. However, the structure of VAT identification numbers is given in the table below.

As it happened, I was looking for a method of verification of the Greek VAT identification numbers. All I knew was that it was a check digit algorithm. As the above quote from the EC/VIES site shows, the algorithms are somehow “secret”. But as the old saying goes, “Information wants to be free”. All it took was asking over at Twitter and a bunch of links were sent to me describing the verification method. Of them, the one by @stsimb stands out as it points to open source code by GSIS (General Secretariat of Information Systems).

So please, Member States, publish your algorithms. Obscurity only delays the inevitable information release.

Greenspun’s Tenth Rule and variations

For those who have not heard Greenspun’s Tenth Rule, it states that:

Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp.

By the way, Greenspun’s rules 1 to 9 do not exist.

Seven months ago, during a discussion about Prolog, I asked Ozan S. Yigit to reformulate Greenspun’s tenth rule for Prolog. Oz replied:

Any sufficiently complicated modern program contains a buggy, informal implementation of prolog that casual observers confuse with lisp.

Just hours earlier I was basically a listener in a discussion that involved NoSQL. While clearly I am not a NoSQL advocate, I am no hater either, but what I heard led me to the following reformulation of Greenspun’s rule, this time involving the relational model:

Those who blindly adopt #NoSQL will discover a variation of Greenspun’s tenth rule

I am sure that many other variations exist. In fact the Wikipedia page on Greenspun’s Tenth Rule contains a Prolog variation similar to Ozan’s and an Erlang version. So if you know of (or can make up) any other, please post it here (or somewhere).

An alternative to FEATURE(mailertable)

Using FEATURE(mailertable) one can instruct sendmail to route email for certain destinations via a specific relay. A mailertable is essentially a static map that tells sendmail where to route email for certain destinations, ignoring DNS MX RRs (or other information). Example:

yahoo.com   smtp:[server.example.com]
yahoo.com.hk   smtp:[server.example.com]
yahoo.com.mx   smtp:[server.example.com]
yahoo.com.br   smtp:[server.example.com]
yahoo.com.cn   smtp:[server.example.com]
yahoo.com.sg   smtp:[server.example.com]

Why would one want to do that? Your customers may have been hit by a botnet and as a result your outgoing mail server may have sent an enormous amount of spam. Since most high-profile mail hubs use some kind of reputation scheme on the IP addresses that contact them, it is quite probable that your outgoing mail server is experiencing delays, or worse, is being denied delivery, despite the fact that in the meantime you have done your best to stop the botnet and clear your queues. I know, for it has happened to me.

A mailertable is a quick solution to route email through another mail server just for recipient domains that implement such policies. But it is far from perfect, for the Postmaster has no way to know all the domains that Yahoo! Mail (in the above example) hosts in order to construct a mailertable. Luckily, when high-profile mail hubs (like Gmail, Yahoo! Mail and Hotmail) follow consistent patterns in their DNS MX RRs, a programmatic (instead of a static) solution can be deployed:

LOCAL_CONFIG
Kbestmx bestmx -T.TMP

LOCAL_RULE_0
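# Prepend the best MX of the recipient domain (or NOTFOUND) to the address
R$+ < @ $+ > $*         $: $(bestmx $2 $: NOTFOUND $) $| $1 < @ $2 > $3
# If the best MX is under hotmail.com, relay via server.example.com
R$+.hotmail.com. $| $+ < @ $+ > $*      $#esmtp $@ [server.example.com] $: $2 < @ $3 > $4
# Otherwise drop the lookup result and continue with the original address
R$+ $| $+ < @ $+ > $*   $: $2 < @ $3 > $4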

In the above snippet, any email that is directed to a domain that is served by Hotmail’s servers is routed via server.example.com. For the record, our outgoing webmail server achieved a senderscore of 50, and although a filter stopped the plague, Hotmail silently discarded email originating from it. Using the above solution restored communications for our users.

New eBooks on Graph Theory

My Twitter stream and my INBOX brought to my attention two new books on Graph Theory:

“Graph Theory and Complex Networks: An Introduction” by Maarten van Steen. It is very interesting to note that this book is also available electronically as a personalised PDF. As the author notes: “When you write a book containing mathematical symbols, thinking big and acting commercially doesn’t seem the right combination. I merely hope to see the material to be used by many students and instructors everywhere and to receive a lot of constructive feedback that will lead to improvements. Acting commercially has never been one of my strong points anyway”.
The other book is the fourth edition of Reinhard Diestel’s “Graph Theory”. This book is also available electronically in different formats. I bought the student edition for €12.50 (offer expires on Aug 15, 2010).

PS: On a side note, I decided to buy a BeBook Mini.

Using bestmx for discarding outgoing email

The following ruleset discards email that originates from domains for which we are not best MX. It is meant to be applied on outgoing email servers:

LOCAL_CONFIG
Kbestmx bestmx -T.TMP

LOCAL_RULESETS
SLocal_check_mail
R$*                               $: $>canonify $1
# You may (or may not) want to comment the following line
R < @ >                           $#OK
R$* < @ $+. > $*          $1 < @ $2 > $3
R$* < @ $+ > $*                   $: $2
# Short circuit certain domains (and host names)
Rexample.com                           $#OK
R$* . example.com                      $#OK
R$*                               $: $(bestmx $1 $: NO $)
# If a temporary error occurs, do not block
R$*.TMP                           $#OK
Rserver.example.com.          $#OK
R$*                               $#discard $: $1

This works for as long as spammers use domains whose DNS zones they do not control. If they do control the DNS zones, they can easily add your relays as MX records to them. In such cases the above ruleset must be modified to look up the name servers of the domains for which server.example.com is best MX and only then decide whether to discard. However, the above trick erased thousands of outgoing spams yesterday.

PS: As I posted on Twitter, I rewrote the above filter in ~35 lines of Perl (subroutine filter_sender for MIMEDefang’s mimedefang-filter). The sendmail version is both more compact and more readable (at least to me).