Programming languages I spend my time on

I do not remember the podcast episode, but a guest said something that stuck with me: the JVM is the single most thoroughly engineered piece of software there is. Exaggeration aside, the guest was mostly right. We deploy tons of stuff that runs on the JVM and we have to turn a multitude of knobs (usually by copy-pasting from Stack Overflow / Server Fault) until it somehow works. That’s why I learned Groovy: to be able to write 10 lines of code that would run on the JVM.

Erlang’s BEAM is another platform that needs to be mentioned. It still does not have the adoption it should, given that we now run distributed systems all the time and need to orchestrate stuff. We prefer to hit our hammers on Kubernetes instead. Maybe this is because of the Prolog-like feeling of Erlang. That’s why Elixir has been on my bucket list. I’ve not written a single line of code in it yet.

Golang is the obvious suspect when you’re paid to run stuff on Kubernetes. The combination is like C and Unix: Go and Kubernetes. There’s nothing more to add here.

LLVM is another thing to look into. It seems to be the compiler backend, especially when you’re not writing a compiler of your own. Guess what? Julia is the thing I’m looking into. At some point in time, you’re going to need something other than Python and Pandas or a similar combination. My bet is Julia. I have written 10 lines of code in it :)

Anything more exotic? Well, as I am approaching 50, I’m thinking of visiting APL. But not without a project at hand.

I could have invested all this time in learning a single language instead: C++

Why I like Groovy

I am writing this following a discussion with a colleague, where he pointed out his dislike of the Jenkins pipeline language and I commented that I liked Groovy. He sounded astonished, so this gives me a chance to elaborate a bit on that:

I run systems and I am not a Software Engineer, but I do write glue code all the time. Over the past few years I’ve come across a number of Jenkins installations, and Groovy is a bit of a required asset for more complex Jenkins stuff. That’s how I got my working, trial-and-error knowledge of Groovy.

I happen to like functional languages. But since I am not a SWE and since in my location no FP jobs existed in the market, I am paid to do stuff people know I do well, not stuff I want to play with. To this end, Groovy is the closest thing to FP I can get paid working with. And the trick in life is to find what is play to you but looks like work to people who want to pay you for it.

On the same track, Groovy runs on the JVM. Why not Clojure, you say? Because I can get paid working with Groovy, whereas I would only be considered a junior engineer when seeking Clojure work. And time is a valuable and constrained resource. I do not have infinite free time to learn Clojure. I did happen to be allowed to learn Groovy to save the day on a system.

Even though I first worked with Java when it ran on SPARC Solaris 2.3 machines, I decided not to invest my time in the language, foreseeing that such an investment would make me a monoglot, and I sure enjoy my ability to switch languages more, even for glue code and/or projects. But, again, it happens that the JVM is one of the most engineered pieces of software of the past 25 years and you will find it running almost everywhere. So when you run systems you need to understand it somehow. And you will sometimes be required to write code that runs on it.

Hence my answer: Groovy and the book next to me on the desk.

It is not my first choice, but it seems to be the optimal one for me. Today Groovy is number 12 on the TIOBE index and Go (my next similar choice because of Kubernetes) is number 14.

If you want to make a dent in the world…

I was reading the beginning of the History of Clojure (a language that I have no time to invest in learning, but still interesting to me):

I started working on Clojure in 2005, during a sabbatical I funded out of retirement savings. The purpose of the sabbatical was to give myself the opportunity to work on whatever I found interesting, without regard to outcome, commercial viability or the opinions of others. One might say these are prerequisites for working on Lisps or functional languages.

If you want to make any dent in the world, you either make it at a personal cost (not always monetary), or with other people’s money. Savings I have none. Self realizations are evident.

Wasting time with gawk while parsing lsof output

So I wanted to parse lsof output, to see on which ports a machine was accepting connections. Normally one would write something like:

# lsof -Pn -i | grep LISTEN | awk '{print $9}' | cut -d: -f2 | sort -n | uniq
22
111
6066
7011
7015
7077
8080
10050
35735
37480
39118
44262
44444
52539

You get a sorted list of the open ports and are done with it. But why invoke five different programs (grep, awk, cut, sort, uniq) to do extraction and sorting, when gawk is a complete programming language? Yes, it is possible to do it with gawk in one go (and learn something in the process):

# lsof -Pn -i | awk '/LISTEN/ { split($9, a, ":"); b[a[2]] = 1; } END { n = asorti(b, c, "@ind_num_asc"); for (i = 1; i <= n; i++) { print c[i]; } }'
22
111
6066
7011
7015
7077
8080
10050
35735
37480
39118
44262
44444
52539

The /LISTEN/ effectively greps the lsof output for lines containing LISTEN and executes on them the code in curly braces to its right, which splits the 9th column into an array a using : as a delimiter and then uses the port, a[2], as an index into a second array b. Since an index can exist only once, this is what removes the duplicates. In awk, arrays produced by split() are indexed from 1 and the indices are strings (make a note of that).

END is a special pattern that executes the code in curly braces to its right after we’ve finished reading the input data. So, here is where the printing is done. Using the asorti() function we obtain a new array c, indexed from 1 to n, whose values are the sorted indices of b. We use @ind_num_asc to ensure that the order is 1, 5, 10, 15 and not 1, 10, 15, 5 as it would be, should the indices be treated as strings. Finally, we can print the elements of the new array.
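To see the effect of the ordering argument in isolation, here is a small sketch that feeds the same four numbers to asorti() twice. It assumes gawk is installed; order_ports is a made-up helper name, and "@ind_str_asc" is gawk's string-ordering counterpart of "@ind_num_asc".

```shell
# order_ports HOW: hypothetical helper that prints the indices of an array
# sorted by asorti() according to the ordering spec HOW.
order_ports() {
  printf '%s\n' 5 15 1 10 |
    gawk -v how="$1" '
      { b[$1] = 1 }                 # store each input value as an array index
      END {
        n = asorti(b, c, how)       # c[1..n] now holds the indices of b, sorted
        for (i = 1; i <= n; i++) printf "%s%s", c[i], (i < n ? " " : "\n")
      }'
}

order_ports "@ind_num_asc"   # indices compared as numbers
order_ports "@ind_str_asc"   # indices compared as strings
```

The first call should print 1 5 10 15 and the second 1 10 15 5, which is exactly the difference described above.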

This would not be easily possible with awk / nawk, because as the gawk manual says:

In most awk implementations, sorting an array requires writing a sort() function. This can be educational for exploring different sorting algorithms, but usually that’s not the point of the program. gawk provides the built-in asort() and asorti() functions.
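For comparison, a sketch of how the same job might look where only a POSIX awk is available: awk still does the matching and the de-duplication, and the numeric ordering is handed over to sort. The lsof lines below are invented sample data, and sample_lsof and listening_ports are made-up names, so that the snippet can run without lsof or root.

```shell
# Invented lines standing in for `lsof -Pn -i` output (illustration only).
sample_lsof() {
cat <<'EOF'
sshd     812 root  3u IPv4 18233 0t0 TCP *:22 (LISTEN)
sshd     812 root  4u IPv6 18235 0t0 TCP *:22 (LISTEN)
java    1420 app  45u IPv4 20711 0t0 TCP *:8080 (LISTEN)
rpcbind  640 rpc   8u IPv4 15002 0t0 TCP *:111 (LISTEN)
java    1420 app  51u IPv4 20950 0t0 TCP 10.0.0.5:8080->10.0.0.9:51234 (ESTABLISHED)
EOF
}

# Portable awk: print each port the first time it is seen,
# then let sort(1) take care of the numeric ordering.
listening_ports() {
  awk '/LISTEN/ {
         split($9, a, ":")
         if (!(a[2] in seen)) { seen[a[2]] = 1; print a[2] }
       }' | sort -n
}

sample_lsof | listening_ports
```

This is two programs instead of one, which is the trade-off the manual hints at: without asorti() you either write your own sort function or borrow the one from the shell.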

Somehow this reminds me of Knuth vs McIlroy, but of course I am neither.

cursive

Figlet, a program to make large letters out of ordinary text, was featured on Lobsters and it reminded me of cursive, a similar program that we used back when ASCII art was the only decoration one could add to their email signature. And since it seems that cursive is not carried by any Linux distribution, I set out to find the sources. Which I did, thanks to FreeBSD.

I downloaded the source and tried to compile it, and it required xstr. Yet another old program! I think the last place one can see it is the OpenBSD 5.5 manual. I mean, even the 5.6 changelog writes:

“mkstr was intended for the limited architecture of the PDP 11 family.” Time moves on, memory gets cheaper. There’s no need for mkstr or xstr.

Hacks like xstr and mkstr might be of interest to embedded, or otherwise deprived, systems people as cool hacks and peculiarities, but really no actual need to maintain them.

So I set out to make the cursive source compilable with xstr again (it compiles without it just fine, just type make lcursive instead). And the source code of a program that has happily been signing email since 1985, minimally changed to eradicate compiler warnings, is now on GitHub:

   __
  /  `
 /--   ____  o ____  ,
(___, / / (_/_(_) (_/___
           /       /
         -'       '

Shoutouts to my good friend Panagiotis C. who was the person who showed me cursive and figlet some 25 years ago or so.

In sed matching \d might not be what you would expect

A friend asked me the other day whether a certain “search and replace” operation over a credit card number could be done with sed: Given a number like 5105 1051 0510 5100, replace the first three components with something and leave the last one intact.

So my first take on this was:

# echo 5105 1051 0510 5100 | sed -e 's/^\([0-9]\{4\} \)\{3\}/lala /'
lala 5100

which works, but is not very legible. So here is a version taking advantage of the -r flag, if your modern sed supports it:

# echo 5105 1051 0510 5100 | sed -re 's/^([[:digit:]]{4} ){3}/lala /' 
lala 5100

So my friend asked, why not use \d instead of [[:digit:]] (or even [0-9])?

# echo 5105 1051 0510 5100 | sed -re 's/^(\d{4} ){3}/lala /' 
5105 1051 0510 5100

Why does this not work? Because, as is pointed out in the manual:

In addition, this version of sed supports several escape characters (some of which are multi-character) to insert non-printable characters in scripts (\a, \c, \d, \o, \r, \t, \v, \x). These can cause similar problems with scripts written for other seds.

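There. In GNU sed, \dNNN stands for the character whose decimal ASCII value is NNN, so \d never means “a digit” and the pattern simply does not match. I guess that is why I still do not make much use of the -r flag and prefer to escape parentheses when doing matches in sed.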
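If legibility is the goal, one middle ground is to keep the extended syntax but spell the digit class as [0-9], which avoids the \d escape altogether. A minimal sketch, assuming a sed that accepts -E (both GNU and BSD sed do, and GNU sed treats -r as a synonym); mask_card is a made-up name:

```shell
# mask_card: hypothetical helper that masks the first three 4-digit groups
# of a card number read from stdin and leaves the last group intact.
mask_card() {
  sed -E 's/^([0-9]{4} ){3}/XXXX XXXX XXXX /'
}

echo "5105 1051 0510 5100" | mask_card
```

With the sample number from above this should print XXXX XXXX XXXX 5100.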

a newbie does list comprehensions

Formatting this post in WordPress.com was a great pain. It does not render correctly on some browser / device combinations, despite my rewrite efforts. So a Markdown copy of this post can be found as a gist here.


The year is 1998 and @mtheofy, then at Glasgow, tells me about a relatively new (then) language called Haskell. I’m intrigued but do not do much. A few years later I buy The Haskell School of Expression since The Craft of Functional Programming did not seem enough to motivate me. Time passes and around 2007 I try yet another start. Nothing. I promised myself yet another restart for a 2017 new year’s resolution. Still nothing. So when my current employer offered Haskell classes I could not say no. Armed with the weekly classes and a Safari Learning Path I am trying to correct this. And I am having some fun with list comprehensions. Because as a friend says, if it makes you feel good, go.

So how do you write an infinite list? Let’s say you want list x to include all numbers from 0 to infinity. stack ghci is my friend. Others might try repl.it:

x = [ n | n <- [0..]]

Now you can have the first 20 items of x:

Prelude> x = [ n | n <- [0..]]
Prelude> take 20 x
[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19]
Prelude>

So next I wanted to make an infinite list of the same character. Enter the underscore variable:

Prelude> x = [ 'a' | _ <- [0..]]
Prelude> take 20 x
"aaaaaaaaaaaaaaaaaaaa"
Prelude>

OK, so now let’s try to cycle infinitely through the characters of a string. I end up with:

Prelude> x = [ c | i <- [0..], let s = "abcd", let c = s !! (i `mod` 4) ]
Prelude> take 20 x
"abcdabcdabcdabcdabcd"
Prelude>

I posted my creation to Twitter, although, being only ~10 days into typing stuff, I am kind of unsure why the let statements are needed. What my expression says is that x is comprised of characters from the string “abcd”, where, given a sequence of numbers, each time a character is chosen based on the sequence number modulo 4. Strings are lists of characters in Haskell and list indexing starts from zero. Helpful comments come my way, like the obvious cycle (there is a cycle function? Yes):

Prelude> take 20 (cycle "abcd")
"abcdabcdabcdabcdabcd"
Prelude> take 20 $ cycle "abcd"
"abcdabcdabcdabcdabcd"
Prelude>

Isn’t the dollar operator a nice way to get rid of parentheses? Here is another suggestion about cycling a string:

Prelude> x = [ "abcd" !! (i `mod` 4) | i <- [0..] ]
Prelude> take 20 x
"abcdabcdabcdabcdabcd"
Prelude>

This one is more concise and does the same thing, always picking a character from "abcd". If the infix notation for mod confuses you, you can write:

Prelude> x = [ "abcd" !! (mod i 4) | i <- [0..] ]
Prelude> take 20 x
"abcdabcdabcdabcdabcd"
Prelude>

But the Internet does not stop there. It comes back with more helpful suggestions:

Welcome! A little feedback then if I may: the !! operator should be used VERY cautiously it is not typesafe and lists are not random access anyway. Opt for a function returning Maybe x and for a random access datastructure (strings are by default lists).

Which made me think: How about an infinite string randomly chosen from “abcd”?

$ stack install random
$ stack ghci
:
Prelude> import System.Random
Prelude System.Random> g <- newStdGen 
Prelude System.Random> x = [ "abcd" !! i | i <- randomRs (0,3) g ]
Prelude System.Random> take 10 x
"bcbbddcdab"
Prelude System.Random>

If you want a sequence with a different order, you need to reinitialise both g and x. I will figure out a better way some other time when …I have time.

Adventures with Maybe maybe in another post.

resolutions

Last year I promised myself that I would revisit Haskell. Well, I did not, so I did not escape the new year’s resolutions cliche. It was an interesting year though, considering that I left my country, worked for Intel, resigned and returned to Greece and to my previous work.

So for this year I will promise myself something simpler, as a continuation of things I still do in 2017: simply improve my Go-fu. And yes, I also tried to learn Go and miserably failed. Let’s see about that too.

Parsing Techniques – A Practical Guide

Memory gets triggered in the most unexpected ways. I maintain a fairly large library of printed and electronic books on subjects that interest me (most of the electronic ones DRMed: social DRM in the light cases, Kindle- and Adobe-locked the rest, unfortunately). It is fairly evident that I will not read them all, but I always have a book (and sometimes a paper) to recommend to a friend who has a problem. It seems that I am not the only one who thinks that personal libraries are supposed to be full of unread books.

Anyway, I was listening to Podcast.__init__ Episode 95 and one of the guests mentioned Parsing Techniques – A Practical Guide by Grune. I think it was when they touched on Earley parsers and how most books about parsing do not really cover how the actual parser is built. Wait a minute, I’ve got that PDF! And you can go to the author’s site and download it. And you know what? There is a second edition out. At > 100 euros for a DRMed PDF I may not buy it, since parsing is definitely not my thing, but somebody else out there might need the second edition. Judging from my skimming of the first edition, this is close to the encyclopaedia of parsing. I will go through some pages tonight.

Just for a refresher.