Revisioning /etc with Git

First and foremost I’m a coder, a coder who strongly believes in revision control. Second, I am also a sysadmin, but only by accident. I have some servers and someone has to take care of them. I’m not a good sysadmin because I don’t have any interest in it, so I learn as little as possible, and also because I don’t want to be good at it and get stuck doing that all the time. What I love is to code.

I’m always scared of touching a file in /etc. What if I break something? I decided to treat /etc the same way I treat my software (where I am generally not afraid of touching anything). I decided to use revision control. So far I’ve never had to roll back a change in /etc, but it gives me great peace of mind knowing that I can.

In the days of CVS and Subversion I thought about it, but I never did it. It was just too cumbersome. DVCS changed that and made it trivial. That’s why I believe DVCS is a breakthrough. Essentially, all you need to do to revision-control your /etc with Git is to go to /etc and turn it into a repository:

cd /etc
git init

Done. It’s now a repo. Then, commit everything in there.

git add .
git commit -am "Initial commit. Before today it's prehistory, after today it's fun!"

From now on every time you modify a file you can do

git add <filename>
git commit -m "Description of the change to <filename>"

where <filename> is one or many files. Or if you are lazy you can also do:

git commit -am "Description to all the changes."

which will commit all pending changes. To see your pending changes do:

git status

and

git diff

When there are new files, you add them all with

git add .

and if some files were removed, you have to run

git add -u .

to be sure that they are removed in the repo as well (otherwise they’ll stay as pending changes forever).

Those are essentially all the commands I use when doing sysadmin work with git. If you ever have to do a rollback, merge, branch, etc., you’ll need to learn git for real.

One last thing. I often forget to commit my changes in /etc, so I created this script:

#!/usr/bin/env sh

cd /etc
git status | grep -v "On branch master" | grep -v "nothing to commit"
true # Don't return 1, which is what git status does when there's nothing to do.

in /etc/cron.daily/git-status, which reminds me, daily, of my pending changes.
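The same reminder could also be sketched in Python. This is only an illustration, not the script I actually run: it assumes git is on the PATH and relies on `git status --porcelain`, whose output is meant for scripts (one line per changed file, nothing at all when the tree is clean), so there is no need to grep away the human-readable messages. The function names `pending_changes` and `report` are made up for this example.

```python
import subprocess


def pending_changes(porcelain_output):
    """Turn `git status --porcelain` output into (status, path) pairs."""
    changes = []
    for line in porcelain_output.splitlines():
        if not line.strip():
            continue
        # Each porcelain (v1) line is a two-character status, a space, then the path.
        changes.append((line[:2].strip(), line[3:]))
    return changes


def report(repo_dir="/etc"):
    """Print the pending changes in repo_dir, if any; stay silent otherwise."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    for status, path in pending_changes(result.stdout):
        print(f"{status:>2} {path}")
```

Calling `report()` from a daily cron job would print something only when there are uncommitted changes.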

Hope it helps!

Please Select your Language

I apparently speak Spain, United States, United Kingdom and Canada.

I would also like to speak Germany because it might be useful in Switzerland, you see, in Switzerland they speak a version of Germany, something like Swiss Germany.

As seen on http://easportsactive.com. And by the way, the reason why I was at their site was to try to figure out whether EA Sports Active, here in Switzerland at least, comes multilingual or not. From the box it seems to be only in German (or should I say Germany?); searching on-line I’ve found conflicting results. It seems EA Sports doesn’t dig multilingualism. They should support Esperanto to not have to deal with that problem (of course I had to drop some Esperanto propaganda!).

The sad truth about testing web applications

There are many ways to test a web application. At the lowest level, we have unit tests; at the highest level we have HTTP tests, those that use the HTTP protocol to talk to a running instance of your application (maybe running it on demand, maybe expecting it to be running on a testing server).

There are several ways to write HTTP tests. There are two big families: with and without a web browser. Selenium is a popular way to write tests with a browser. A competing product is WebDriver, which I understand can use a browser or other methods. If you’ve never seen Selenium before, it’s pretty impressive. You write a test that says something like:

  1. go to http://…
  2. click here
  3. click there
  4. fill field
  5. fill field
  6. submit form
  7. assert response

and when you run it you actually see a Firefox window pop up and perform that sequence amazingly fast. Well, it’s amazingly fast the first three runs, while you still have two tests or less. After that it’s amazingly slow, tedious, flaky and intrusive.

For the other family of tests, without a web browser, aside from WebDriver we have HttpUnit, HtmlUnit and most of the Ruby on Rails testing frameworks. The headless solutions tend to be faster and more solid, but the scenarios are not as realistic (only one JavaScript engine, if you are lucky, and no rendering issues such as slowdowns).

When you are testing, as soon as you touch the HTTP protocol everything becomes much harder and less useful. If you want to be totally confident a web application is working you need to test at the HTTP level, but the return on investment for those tests is very low: they are hard to write and not very useful.

Hard to write

They are hard to write because you are not calling methods with well-defined interfaces (lists of arguments) but essentially calling one method, the HTTP request, passing different parameters to get different results. You don’t have any code completion and you don’t have any formal way to know which arguments to pass. Anything can be valid.

In a unit test you may have something like:

add_user("john");

while in an HTTP test you’ll have something like

http.send_request("/user/create", "username=john");

When you are writing a unit test, figuring out the name of the add_user function and its arguments is easy. Some IDEs will autocomplete the name and show you the argument list. And if the name of add_user changes, some refactoring tools will even fix your tests for you.

But “/user/create” and “username=john” are strings. To figure them out you’ll have to know how your application handles routing, and how the parameters are passed and parsed. If your application changes from “/user/create” to “/user/add” the test will just break and, most likely, with a not-very-useful error message. Which takes us to the next issue…
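To make the contrast concrete, here is a minimal sketch in Python. Nothing in it comes from a real framework: `send_request`, `ROUTES` and `create_user_route` are hypothetical stand-ins for an HTTP round trip and a routing table, kept in-process so the example stays self-contained.

```python
from urllib.parse import parse_qs

users = []


def add_user(username):
    """The plain function a unit test would call directly."""
    users.append(username)
    return username


def create_user_route(body):
    """Parse a form-encoded body and delegate to add_user."""
    params = parse_qs(body)
    return 200, add_user(params["username"][0])


# Hypothetical routing table: the paths are just strings.
ROUTES = {"/user/create": create_user_route}


def send_request(path, body):
    """Stand-in for an HTTP round trip: everything goes through one entry point."""
    handler = ROUTES.get(path)
    if handler is None:
        return 404, "Not Found"
    try:
        return handler(body)
    except Exception:
        # A server would typically hide the details of the failure from the client.
        return 500, "Internal Server Error"
```

With this, `send_request("/user/create", "username=john")` succeeds, but a renamed route (`"/user/add"`) or a misspelled parameter (`"user=john"`) only comes back as a 404 or a 500. Nothing checks those strings before the request is made, which is exactly what makes these tests hard to write.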

They are not very useful

They are not very useful because their failures are cryptic. When you write a test that calls method blah, which calls method bleh, which calls method blih, and then bloh and bluh and bluh divides by zero, you get an exception and a stack trace. Something like:

bluh:123: Division by zero! I can't divide by zero (I'm not Haskell)
bloh:234: bluh(...)
blih:452: bloh(...)
bleh:34: blih(...)
blah:94: bleh(...)
blah_test:754: blah(...)

You know that the test blah_test failed on line 754 when calling blah, which called bleh on line 94, which called blih on line 34, which called bloh on line 452, which called bluh on line 234, which divided by zero on line 123. You jump to bluh, line 123, and you may find something like:

a = i / 0;

where you replace the zero with something else; or most likely:

a = i / j;

where you have to track where j came from. Either it was calculated there or generated from another method and passed as an argument. The stack-trace gives you all the information you need to find where j was generated or where it came from. That’s a very useful test.
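As a small illustration of how much an in-process failure tells you, here is a sketch in Python that reproduces the call chain above and collects the function names from the traceback. The helper `call_chain_of_failure` is made up for this example; the `traceback` module is part of the standard library.

```python
import traceback


def bluh(i, j):
    return i / j  # fails when j is zero


def bloh(i, j):
    return bluh(i, j)


def blih(i, j):
    return bloh(i, j)


def bleh(i, j):
    return blih(i, j)


def blah(i, j):
    return bleh(i, j)


def call_chain_of_failure():
    """Trigger the division by zero and list every function on the way to it."""
    try:
        blah(1, 0)
    except ZeroDivisionError as error:
        frames = traceback.extract_tb(error.__traceback__)
        return [frame.name for frame in frames]
```

The returned list starts at `call_chain_of_failure` and walks through `blah`, `bleh`, `blih`, `bloh` and `bluh`, so every hop between the test and the failing division is accounted for.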

When you have HTTP in the middle, tests become much less useful. The stack trace of a failure would look something like:

http_request:123: Time out, server didn't respond.
blah_test:45: http_request(...)

That means that blah_test failed on line 45 making an HTTP request call which failed with a timeout. Did your application divide by 0 and crash? Did it try to calculate pi and it’s still doing it? Did it fail to connect to the database? Where did it actually fail? You don’t know. The only thing you know is that something went wrong. Time to open the log files and figure it out.

You open the log file and you find there’s not enough information there. You make the application log much, much more. So much that you’ll fill a terabyte in an hour. You run the test again and this time it just passes, no errors.

When you are at the HTTP level there are many, many things that are flaky and can go wrong. Let’s invent one example here: the web server you were using for the tests wants to DNS resolve everything it can. Every host name is resolved to an IP, and every IP is reverse-resolved to a name. When you ran the test there was a glitch and your name servers were down. Now they are working correctly and they won’t fail again for another year. Good luck figuring that out from a time-out message.

The other way in which HTTP tests fail is something like this:

blah_test:74: Index out of bound for this array

You go to line 74 and it’s something like:

assert_equal("username", data[0]);

If data[0] caused an out-of-bound error, then the array data is empty. How can it be empty? It contains the response from the server and you know the server is responding with something usable because you are using the app right now.

What happened was that the login box used to have the id, in HTML, "login" and it is now "log-in". That means the HTML parsing methods in blah_test don’t find the login box and fail to properly fill the array data. Yet another case of tests exposing bugs, in the tests. And the real-life failures are much, much more complex than this.
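One way to soften this kind of failure is to have the test check its own assumptions about the page before indexing into anything. The following is only a sketch using Python’s standard `html.parser`; `IdCollector` and `require_id` are names invented for the example, not part of any testing framework.

```python
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Collect the id attribute of every element in a page."""

    def __init__(self):
        super().__init__()
        self.ids = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids.add(value)


def require_id(html, element_id):
    """Fail with a readable message if no element has the expected id."""
    collector = IdCollector()
    collector.feed(html)
    if element_id not in collector.ids:
        raise AssertionError(
            f"no element with id {element_id!r}; ids found: {sorted(collector.ids)}"
        )
```

If the page now says `id="log-in"`, `require_id(page, "login")` fails with a message that lists the ids that do exist, instead of an out-of-bounds error a few lines later.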

My recommendation

All this makes the return on investment of writing HTTP tests quite low. They are very hard to write and they provide very little information when they fail. They do provide good information when they pass: if it works at the HTTP level, probably everything else works too.

I’d recommend any project not to write any HTTP test unless every other possible test, unit and integration, is already written.

Computer Science and Software Engineers

Joel Spolsky published yet another complaint about what they teach people to get a Computer Science degree. I think he is right in complaining that no university is producing the kind of programmers he wants, but he’s missing one point.

In Argentina there are two different careers related to programming: Computer Science and Software Engineering. Computer Science, like its counterpart in the USA, produces scientists. Scientists are people not very much concerned with what’s practical or useful, but with advancing knowledge.

You don’t expect physicists (scientists) to build a bridge. Although they may understand all the forces at play, they don’t have the practical training. You have civil engineers that know how to build a bridge. Civil engineers, on the other hand, don’t play with subatomic particles, the beginning of the universe and black holes. They generally don’t advance knowledge, they build practical things.

In the same vein, one should expect nothing else of a Computer Scientist than to use Haskell, push the advance of type inference, experiment with artificial intelligence, dream of computers with a teracore (that is, 10^12 cores) and know nothing about deploying servers, Microsoft tools, etc. And you do expect a Software Engineer to know how to use Java, C#, Python or other current languages and never touch Haskell. They should also be able to organize themselves using agile or whatever to produce working practical products.

The Computer Science and Software Engineering careers in Argentina more or less reflect that. It’s not a clear cut, but in CS you can find lessons in Artificial Intelligence while in Software Engineering you even find some lessons about law. Not sure what they are about, but a Software Engineer does require some basic knowledge of licensing.

I think Joel Spolsky and many others are right to complain that universities don’t produce software engineers, but I think he is wrong to expect them out of Computer Science departments. It would be very sad if Computer Science turned into Software Engineering and there was nobody left to dream of type inference and teracores.

150 years ago

150 years ago a great man was born. His name was Ludovic Lazarus Zamenhof and he was born into a world divided by language, a world of constant violence between Poles, Jews, Russians, etc., all speaking different languages. He thought the problem of the world was that people could not understand each other and set himself the task of fixing it.

He invented what later on became known as Esperanto. You can go to Wikipedia and check the articles on Esperanto and on Zamenhof to get a lot of encyclopedic information. If you want to actually taste or learn the language, my recommendation is to go to Lernu. And with that you can learn your first Esperanto word (if you don’t know any yet): lernu means learn, as in “you learn”. Lerni means to learn.

In this post I will tell you some things I find interesting about Esperanto.

Let’s go on with lerni. School is lernejo. See the relationship? lern – ej – o is school. Ej means “a place for”, so lernejo is literally a place to learn. There are other places like laborejo, which is the place to work. Laboro means work (think of ‘labor unions’).

Zamenhof thought about the task of creating the Esperanto dictionary and the task was so big he thought it was the end. Until he came up with the idea of allowing people to build words. My English-Esperanto, Esperanto-English dictionary is 75% for English, 25% for Esperanto. There are fewer words to learn in Esperanto.


Did you know Wikipedia is available in Esperanto? If you go to wikipedia.org, you’ll see it among the languages with more than 100,000 articles.


And if you go to the English Wikipedia homepage, Esperanto is the only constructed language listed in the left column. Do you want to know something amazing? Vikipedio, the Esperanto Wikipedia, is actually bigger than the Encyclopedia Britannica.

The legend goes that Zamenhof released his book about Esperanto, called La Unua Libro (the first book), and six months later someone knocked at his door speaking Esperanto and asking to practice the language. Esperanto spread like wildfire, unlike any other constructed language.


Today it is estimated that there are 2 million Esperantists in the world. If you consider that 122 years ago there was only one Esperanto speaker, it’s growing quite fast. I would expect its growth is accelerating but it’s very hard to know. No census asks about Esperanto. I know someone who ran an informal survey on the streets of Zürich, asking people whether they spoke Esperanto and then actually asking questions in Esperanto, and he got a 3% positive response.

Those 2 million speakers are not concentrated in one location. They are spread throughout the world, so you are very likely to find Esperanto speakers everywhere if you know where to look.

There are even an estimated 1000 native Esperanto speakers. Basically that happens when a family is formed by a man and a woman who only share Esperanto as a common language. Even if they don’t actively teach their children Esperanto, the children learn it to be able to understand their parents. I know a couple of people that speak it natively.


When talking about how many people speak the language, it’s important to mention that Esperanto speakers were hunted by many totalitarian governments. The Nazi government especially targeted them because Zamenhof was Jewish and, according to Hitler as expressed in his Mein Kampf, Esperanto was the language to be used by the International Jewish Conspiracy to set up a new world order.

In the Soviet Union Esperanto was embraced at first. Most socialist parties saw the potential for international communication and understanding. Joseph Stalin saw it as a way to spread the ideals of communism until he realized that it was a two-way street, that new ideas would come from outside, including capitalism, and denounced Esperanto as the language of spies. Imperial Japan didn’t like the language either.

In all those cases of totalitarianism, Esperanto was forbidden and Esperantists were hunted, exiled or even executed.


The first Esperanto congress was held in 1905, bringing 600 people together from across the world. Since then it has been held every year except during the world wars, with an average of 2000 participants. When it was held in China it was the biggest gathering of foreign people ever to happen in China.


There’s a very practical reason to adopt Esperanto. Currently we waste a lot of resources pretending English is an adequate medium of international communication and in translation. Let me give you one example. In 1975 the World Health Organization denied the following requests:

  • $ 148,200 to improve the health service in Bangladesh
  • $ 83,000 to fight leprosy in Burma
  • 50 cents per patient to cure trachoma, which causes blindness.
  • $ 26,000 to improve hygiene in the Dominican Republic

All those requests were denied. It seems the World Health Organization didn’t have much money. But that same year they approved Arabic and Chinese as working languages, requiring lots of translations and increasing the expenses of the WHO by $ 5,000,000 per year. That’s right, 5 million dollars per year spent on translation when they couldn’t give 50 cents to cure trachoma.


Esperanto is probably the easiest-to-learn usable language out there. The Institute of Cybernetic Pedagogy at Paderborn compared how long it would take French-speaking people to learn different languages to reach the same level:

  • 2000 hours studying German
  • 1500 hours studying English
  • 1000 hours studying Italian
  • 150 hours studying Esperanto

Yes, a tenth of the time it takes to learn English and less than that when compared to German. And something very interesting happens here. The third language you learn takes less effort than the second one.

If you want to learn another language, let’s say, German, it’ll take you less time to learn Esperanto and then learn German than to just learn German. Yes, you’ve read right. Less time to learn two languages than one.

That experiment was done by teaching one year of Esperanto and four of French to some students and five of French to others. The amount of time studying was the same, but those that learned Esperanto first reached a better level of French. So even if you never utter a single Esperanto word out there, it makes economic sense to learn it first, before you learn another language.


Many say that Esperanto will never take off, and they proceed to never learn it and accept a divided, broken world. If you are among those, I’m sorry about your defeat. I’d rather hope and do my part and learn Esperanto. It’s not that hard.

Is it Science Fiction?

I go to a book store and after looking around I’m forced to ask.

– Excuse me, where’s the science fiction section?

The woman points to the back of the book store, to a poorly lit section, next to the children’s books section full of toys and little chairs. Well, at least they have a section. From where I’m standing it looks like a whole section; it probably has around 500 books. There must be something that I haven’t read.

When I arrive I notice that a whole shelf consists of Lord of the Rings books. I continue scanning and I see a lot of stuff about dragons and vampires. There’s even a copy of Harry Potter left over from when it wasn’t popular enough and didn’t deserve the huge tower of books in the middle of the bookstore.

Where is the science in wizards, dragons and magic rings? You know, Science Fiction is called that way for a reason. If I wanted to read fantasy I would have gone to the fantasy section, thank you very much.

This is not the worst. I’ve seen countless top ten science fiction TV show lists that included Buffy and Angel. “Science Fiction” is not a label for weird. I was throwing a huge tantrum about it and my wife, in her infinite understanding, said:

– Maybe they don’t know it isn’t science fiction.

How could they not know? It says “science” in the name. But apparently people are not very logical and never think what a name means (and keep calling the United Kingdom England, The Netherlands Holland, and United States of America, well, America, which is a continent, not a country).

I’ve decided to solve this problem once and for all in the geek-programmer way, which is of course, a web site with voting. I created:

Is it Science Fiction?

Of course, if everybody voted we would end up with the mess the world is today, but I hope only geeks will put up with my bad graphical design skills and actually vote and comment, so we’ll end up with pretty good results. So far Star Wars is 4th from the bottom, heavily on the not-sci-fi side of things, so I’m pretty sure it’s working. You have to be very hard core to believe Star Wars is not Science Fiction.

My goal is to build the canonical place to point to when the discussion about whether something is or isn’t science fiction starts. You won’t have to explain yet again why Lord of The Rings is fantasy, not science fiction, just point to http://isitsciencefiction.com/items/the-lord-of-the-rings. If your favorite pet peeve is not there, feel free to add it.

Of course we are only judging whether something is or isn’t science fiction, not whether something is good or bad. Batman is great, but it’s not Science Fiction. Plan 9 From Outer Space sucks, but it is Science Fiction (well, I don’t know, I haven’t seen it yet).

An intelligent music player

I still haven’t found a good music player, for my computer that is. The one that got the closest to it was Amarok, but still it was very far away. My problem is that I don’t know what to listen to, really! I’m only just finding out what music to use for coding. There’s one thing I really want from a music player: for it to learn what to play for me. It’s not the same as learning what I like. It’s much more complex. Amarok learns what I like, but not really what to play for me.

In Amarok, when you jump to the next song it checks how much of the song you listened to and assigns a score based on that. Songs that you listen to completely get a high score and songs you listen to for only a couple of seconds get a low score. Over time, as you listen, those you like most and listen to most will get high scores while those you despise and skip immediately will get lower scores.
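I don’t know the exact formula Amarok uses, so take the following as a sketch of the idea and nothing more: a running average, on a 0 to 100 scale, of the fraction of the song that was played each time. The function name `update_score` and the scale are assumptions made for this example.

```python
def update_score(old_score, plays, fraction_listened):
    """Fold one more play into a running average score between 0 and 100.

    old_score: the average so far (ignored when plays is 0)
    plays: how many times the song was played before this one
    fraction_listened: 0.0 (skipped at once) to 1.0 (played to the end)
    """
    new_plays = plays + 1
    new_score = (old_score * plays + fraction_listened * 100) / new_plays
    return new_score, new_plays
```

A song played once to the end starts at 100; skipping it immediately the next time brings it down to 50.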

Amarok has a special playlist, or used to have in the 1.4 version, which is called “dynamic” and plays those songs with the highest score. That sounds excellent, but it’s not enough. The music player I’d like to have would not compute how much I like a song, like Amarok does, but how probable it is that I’ll like it at the moment it plays that song.

Let’s call this player Pamup, Pablo’s Music Player, and let’s see how it could pull off such a magic feat as playing songs that you want to listen to (even if you don’t know you want to listen to them).

Pamup would have a scoring for the songs but instead of being a linear score it’ll be multidimensional. Let’s start with two simple dimensions and the rest will be clear: percent of playing time and time of the day. Song A you play 100% and song B 50%. That means that you like song A better than B. That is what Amarok does. Pamup would instead record:

  • Song A in the morning: 100%
  • Song B in the morning: 50%
  • Song A in the evening: 50%
  • Song B in the evening: 100%

You like A as much as B, but you are more likely to want to listen to A in the morning, and B in the evening. Of course adding the time of the day will probably not improve the equation by much. The idea would be to add as many dimensions as possible. Some dimensions may be irrelevant and they should cancel themselves out, like in this case:

  • Song A in the morning: 100%
  • Song B in the morning: 50%
  • Song A in the evening: 100%
  • Song B in the evening: 50%

In that case, you like A better than B, in the evening and in the morning. The time of the day is irrelevant. Maybe it’s only irrelevant for some songs but not for others:

  • Let it be, I like it at all times.
  • O Fortuna of Carmina Burana, please, don’t wake me up with that (or maybe yes, please do, not sure).

Maybe it’s irrelevant for some people, but not for others. I don’t know and we don’t need to know.
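Here is a minimal sketch of that two-dimensional bookkeeping in Python. It is not a real Pamup (which doesn’t exist); the class `ContextualScores` and its methods are invented for the example, and the “context” is just a label such as "morning" or "evening".

```python
from collections import defaultdict


class ContextualScores:
    """Remember how much of each song was played, per context."""

    def __init__(self):
        # (song, context) -> list of fractions listened, each between 0.0 and 1.0
        self._plays = defaultdict(list)

    def record(self, song, context, fraction_listened):
        self._plays[(song, context)].append(fraction_listened)

    def score(self, song, context):
        """Average fraction listened in this context, or None if never played there."""
        plays = self._plays.get((song, context))
        if not plays:
            return None
        return sum(plays) / len(plays)

    def best_song(self, context):
        """The song with the highest average in this context, or None if there is no data."""
        candidates = {
            song: self.score(song, ctx)
            for (song, ctx) in self._plays
            if ctx == context
        }
        if not candidates:
            return None
        return max(candidates, key=candidates.get)
```

Feeding it the first table above (A 100% and B 50% in the morning, the reverse in the evening) makes it pick A in the morning and B in the evening; with the second table it picks A at both times, so the extra dimension does no harm when it turns out to be irrelevant.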

I can think of many other dimensions to add to the system and I’m sure many other people will think of more and as technology improves we’ll be able to have even more:

  • What program are you using? I want music that helps me concentrate when I’m using my text editor to write code, while I don’t care much about what I’m listening to while web browsing.
  • What are you browsing? Maybe I do care about the music while I’m web browsing. Redditing and Facebooking can be done pretty much with any music, but if I’m at Lambda the Ultimate, I need something to concentrate. Even some analysis of the web site could give some important hints: lots of dense text, no pictures, play Mozart; a photo blog, play whatever.
  • How are you controlling the player? Are you using the keyboard with global shortcuts? You are probably doing something else. Are you using the remote control? You are probably away from the computer. Are you using the mouse directly in the player’s window with the lyrics window open? Ok, let’s play something with lyrics because you probably feel like reading, maybe even singing.
  • Are you singing? We can find that out using the computer’s microphone. Let’s play things that are in your vocal range, and mostly by singers of the same gender as you. Let’s also play things you liked singing before.
  • Are you using only one app or switching between various apps?
  • Which apps are you switching with?
  • Is there any other sound coming out from the computer? If so, maybe soothing background music with not much volume is what the player should play.
  • Are you dancing? Let’s disco! You think that’s a tough one? Most smart phones have accelerometers in them. If you have the smart phone in your pocket I’m sure I can find out if you are on the couch or dancing, or maybe moving but not dancing. Even the raw input of the accelerometer could be used as a signal, because it’ll be different depending on how tired you are and how you are dancing.
  • Are you alone? You think that’s a hard one as well? Many people are using wifi, so what’s the signal strength received from other devices on the same network? If another computer has a similar signal level as yours and it is being used, you are probably not alone. It could also be done using smart phones, although a smart phone doesn’t need to be in use, it just needs not to be lying on the table. If it’s plugged into the computer, you can ignore it; if it’s flat and not moving (accelerometer again), you can ignore it.
  • Who are you with? I hope by now you realize how much we can find out. Let’s make it social, let’s have the app on every device. Why would people install it? Well, when you visit me, if you have it on your device, your device will tell my computer what you like, and my computer knows what I like, so it’ll try to find a common ground for us (and it won’t trust me that much when I skip a song, because maybe it’s you skipping it). We could make you use your own smart phone to skip it, and then Pamup knows who is skipping it.
  • Who are you talking with? If you are talking with other people, voice recognition may identify those people, or at least how many there are. If there’s cutlery clatter in the background, people are eating, so let’s just play background music for a nice evening. If it’s only you speaking, maybe you are on an old land-line phone (if you were using your smart phone, Pamup would know), so let’s cut the music altogether, since it’s probably distracting.

I believe this program should not work with special cases but have some very sophisticated machine learning system where we input all these signals and it does the right thing. And as more signals become available, they are added and analyzed as well. I would like to have that music program! Because honestly, really, I’m not sure what music I want to listen to. I want my computer to figure it out for me.