Polymath projects for other disciplines too?

While I was revisiting Gowers’ “Mathematics: A Very Short Introduction” my mind wandered to the first Polymath project (essentially a massively collaborative effort to solve certain mathematics problems, where participation seemed to follow the 90-9-1 principle). Anyone who wants to learn more about Polymath can start from “A gentle introduction to the Polymath project”.

Anyway, as I was reading the paragraph I was looking for, it struck me: Do other disciplines have similar efforts? Wouldn’t it be nice if they did? If not, why? One minute later a second strike came:

– Wait a minute! We were there before Polymath! We have Hackathons!

Although hackathons are more free-spirited (anyone can tackle whatever they want), the outcome still benefits the community around the event.

However, hackathons seem disconnected from academic environments, and that is a pity. Big conferences occur yearly and people have fun discussing their work in the hallway tracks, exchanging ideas and strategies. It seems a bit of a waste that so many bright minds gathered together do not sit around a blackboard, or even collaborate over the Net, and discuss how to attack a problem, any problem, that has endured the test of time. Bright Math people did it, why not the rest?

With HDMS approaching, maybe this is something to consider for the last session. So if anyone among those going to Cyprus is reading, keep this in the back of your mind. I could be wrong and such an effort may not be feasible in another discipline, but I would like to know why.

on picking an MTA

Sometimes I get asked what my MTA (Mail Transfer Agent) of choice is. Almost always I ask back “What do you want to do with it?”. Personally, in most places I install sendmail. There are cases (cases where one would use FEATURE(nullclient) or similar) where I install nullmailer, for I find it unnecessary to run sendmail.

People sometimes ask me why I choose sendmail and not Postfix (or Qmail in the old days) or even Exim, since we are running a mostly Debian shop. Leaving the monolithic argument aside (which is kind of funny when most people who use it are running a monolithic-kernel OS anyway), I use sendmail because of its expressive power. I can find a way to express what I am thinking (filtering, routing, etc.) in its modem noise of a programming language or with milters like MIMEDefang (IIRC, there’s a wonderful PDF presentation by Ricudis on the Turing completeness of the sendmail.cf language, but I have no link to it).

It is not that I have not used other MTAs. Hell, I was even running Postfix alpha versions right after it was renamed from VMailer. And occasionally I am running MeTA1 instances. But I always return to sendmail. If it does not suit you, it is OK. Pick the one MTA that can help you build the setup that you have in mind, be it Exim, Postfix, netqmail, commercial software like Exchange or CommuniGate, whatever. If it works for you and your team, then it is the right choice. Endless debates are for people who have too much free time.

However, if there is one recommendation that I can share, this is it: if you are serious about email (routing), invest some time reading the bat book (the O’Reilly sendmail book). Anyone who can understand a complex piece of software like sendmail can find their way through any email system.

(triggered by a brief conversation I had with a friend this afternoon)

tldcheck – a script to check domain availability on all TLDs and ccTLDs

@dstergiou asked:

I am looking for a tool to see under which TLDs a domain is registered. E.g. check all TLDs for domains matching “test123”

One way to check is to visit registrars that cover TLDs and ccTLDs (like EuroDNS or Gandi) and submit your query to their forms. However, since (at least TTBOMK) none of them offers a query API for checking programmatically, one can script one’s way to an almost complete solution. ICANN, through IANA, provides a list of all available TLDs and we can iterate over it:


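#!/usr/bin/perl

## To grab the TLDs: wget -c http://data.iana.org/TLD/tlds-alpha-by-domain.txt

use strict;
use warnings;
use Net::DNS;

my $check = shift or die "usage: $0 name\n";

## Read the TLD list, skipping the comment header and any blank lines.
open my $fh, '<', 'tlds-alpha-by-domain.txt' or die "cannot open TLD list: $!\n";
my @tlds;
while (<$fh>) {
        next if (m/^#/);
        chomp;    ## chomp, not chop: only strip a trailing newline
        next unless length;
        push @tlds, $_;
}
close $fh;

## One resolver is enough; no need to build a new one on every iteration.
my $res = Net::DNS::Resolver->new;

foreach my $i (@tlds) {
        my $domain = $check . "." . $i . ".";
        my $query = $res->query($domain, "NS");

        if ($query) {
                ## foreach my $rr (grep { $_->type eq 'NS' } $query->answer) { print $rr->nsdname, "\n"; }
                print $domain, "\n";
        }
}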

The above hack comes with three problems:

  • It uses Net::DNS which makes it a little slow. There is room for improvement.
  • It only finds registered domain names that have name servers answering queries about them. This is not always the case. For example, in .GR one can register a domain without having it served.
  • It also does not take into account strict domain hierarchies, like .co.uk, .net.uk, .org.uk but minimal changes are needed to test that too.

Somewhat incomplete as a solution, but adequate as a 5 minute hack for most purposes.
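The third limitation can be worked around by feeding extra suffixes into the same loop. What follows is only a minimal sketch in plain sh: the `candidates` function and the `SLDS` variable are names made up for illustration, and the suffix list is a hand-maintained assumption, since the IANA file knows nothing about second-level registries.

```shell
#!/bin/sh
# Hypothetical, hand-maintained list of second-level suffixes.
# It is NOT part of the IANA file; extend it per registry as needed.
SLDS="co.uk org.uk"

# candidates NAME: read the IANA TLD list on stdin and print one fully
# qualified candidate per TLD, followed by one per suffix in $SLDS.
candidates() {
    name=$1
    # Drop the comment header, lowercase the labels, skip empty lines.
    grep -v '^#' | tr '[:upper:]' '[:lower:]' |
    while read -r tld || [ -n "$tld" ]; do
        if [ -n "$tld" ]; then
            printf '%s.%s.\n' "$name" "$tld"
        fi
    done
    for sfx in $SLDS; do
        printf '%s.%s.\n' "$name" "$sfx"
    done
}

# Usage (needs network access and dig):
#   candidates test123 < tlds-alpha-by-domain.txt |
#       while read -r d; do
#           if [ -n "$(dig +short ns "$d")" ]; then echo "$d"; fi
#       done
```

Generating the candidate names separately from the DNS lookups keeps the name-building part testable offline, and the same list can be piped into either dig or the Perl script's resolver loop.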

P.S. @stsimb offers a similar solution using dig:

for sfx in $(grep -v '^#' tlds-alpha-by-domain.txt); do dig +short ns "test123.$sfx."; done

Moacyr Barbosa

“Only three people have, with just one motion, silenced the Maracanã: Frank Sinatra, Pope John Paul II and me.” (Ghiggia, about his goal.)

Moacyr Barbosa lived the rest of his life as an outcast. In his own words: “The maximum punishment in Brazil is 30 years imprisonment, but I have been paying, for something I am not even responsible for, by now for 50 years.”

Robert Green will be fried by the English press, but an outcast he will not become.

(Inspired by today’s goal, scored by Dempsey on Green)

on team formation

In a meeting today a friend (quietly) observed that opening a process to a wider audience too fast may compromise the very process that interests the intended audience. I replied with the thesis that:

“Whenever data increases, quality drops (for any quality metric)”

I first heard that thesis 15+ years ago in a meeting about data warehouse quality. Usually, when a few people get together for a certain task, it goes like this:

[Figure: Small team, with people working towards similar goals]

Increase the number of participants and you get something like this:

[Figure: More people join the party, and things get interesting]

Add a political twist and some power-play (personal or between organizations) and you get this:

[Figure: Politics and power-plays set the project's final course (do nothing)]

This is to be expected. David Alan Grier in “The Dictator and the Web Design” (IEEE/Computer, May 2009) notes:

“Traditional management theories identify such fights as the second part of a four-stage development process for small groups, the forming-storming-norming-performing steps that psychologist Bruce Tuckman identified in the 1960s. “Group members become hostile toward one another as a means of expressing their individuality and resisting the formation of group structure,” Tuckman claimed.

In Tuckman’s model, committee members must go through a period in which they express their objections to the collaboration in emotional terms (the storming stage) before they can learn to work together (norming) and actually accomplish their goals (performing).”

So there, as long as performance does not go “our way”, quality drops. By the way, this also explains why Panathinaikos B.C. prevails over Olympiacos B.C. in Greek A1. They both have excellent players, but Panathinaikos make sure that all of them are focused in the same direction. They are a team that is performing, while the others are still forming.

I don't give a damn about the Championship

I am waiting for the day when some club official will take the team and walk off the pitch (as Voulinos once did), even if that hands the opponent the championship. It is stupid, to say the least, to have put €35M into a team and then beg for the stadium to be emptied.

I remember that the first match my father took me to was Olympiacos – Iraklis (a friendly, 4-3, with Neumann, Kousoulakis, Orfanos, Papamichail) at Gate 14 (Gate 7 was closed for works, so everyone was at 14). It occurs to me that I no longer have the option of taking my children to the stadium, not to the “hard” gates, and not even to the quiet ones. And if one considers what it means for a family of five to go to the stadium, and how much they would spend on tickets, food, branded merchandise, etc., I would like to see who will lose in the long run: the families that will not go to the stadium, or the teams that will play in empty (and not as a punishment) stadiums?

(The game has not started yet)

a bit of history on the relational model

Thanks to Software Memories we learn about David Childs and his work on Extended Set Theory. I quote from the blog post:

“Way back in 1968, Childs wrote a paper outlining how set theory, relations, and tuples could be applied to data management.

And that’s where I did a double-take, because 1968 < 1970. Sure enough, Footnote #1 in Codd’s seminal paper is to Childs’ 1968 work. Indeed, Childs’ paper is the only predecessor Codd acknowledges as having significant portions of his idea.”

It seems that there was life before God Codd after all.