Pablo's blog

A bit of this, a bit of that and a lot about computers

Archive for the category “Technical”

Why I love Smalltalk

This post was extracted from a small talk I gave at Simplificator, where I work, titled “Why I love Smalltalk and Lisp”. There should be another post, “Why I love Lisp” following this one.

After I learned my basic coding skills in more or less traditional languages, like C, C++ and Python, there were four languages that really taught me something new. Those languages changed my way of thinking and even if I never use them, they were worth learning. They are:

  • Smalltalk
  • Lisp
  • Erlang
  • Haskell

You can probably add Prolog to that list, but I never learned Prolog. This post is about Smalltalk.

My goal is not to teach Smalltalk but to show things that you can do with Smalltalk that you can’t do with any other language (disclaimer: surely other languages can do it, and we’ll call them Smalltalk dialects). Nevertheless I need to show you some basics of the language to be able to show you the good stuff, so here we go, a first program:

1 + 1

That, of course, evaluates to 2. If we want to store it in a variable:

m := 1 + 1

Statements are separated by a period, like this:

m := 1.
m := m + 1

In Squeak, a Smalltalk implementation, there’s an object called Transcript and you can send messages to it to be displayed on the screen. It’s more or less like a log window. It works like this:

Transcript show: 'Hello world'

and it looks like this:

Squeak transcript showing the result of Transcript show: 'Hello World'

The syntax is quite unique to Smalltalk. The message, otherwise known as “method call” in other languages, is called show: (including the colon) and it takes an argument. We can run it 10 times in a row with the following snippet:

10 timesRepeat: [
  Transcript show: 'Hello world'
]

There you can start to see how Smalltalk is special. I’m sending the message timesRepeat: to the object 10, an Integer. Doing something N times repeatedly is handled by the Integer class, which if you think about it, makes sense.

The second interesting part is the block, the part inside square brackets. You might think that’s the equivalent of other languages’ block syntax, like in this Java example:

for(int i=1; i<11; i++) {
  System.out.println("Hello world");
}

but Smalltalk’s version is a bit more powerful. It’s a real closure. Look at this:

t := [
  Transcript show: 'Hello world'
]

Now I have a variable named t, of type BlockClosure, and I can do anything I want with that variable. If I send it the class message it’ll return its class:

t class

and if I send it the value message, it’ll execute and leave a “Hello world” in the transcript:

t value
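For readers more at home in Ruby, roughly the same idea can be sketched with a lambda. This is only an analogy: make_counter is a name made up for this example, and call plays the role of Smalltalk’s value message.

```ruby
# make_counter is a hypothetical helper, not something from Squeak or the post.
# It returns a lambda that keeps access to the local variable `count`
# after make_counter itself has returned, which is what makes it a closure.
def make_counter
  count = 0
  -> { count += 1 }
end

counter = make_counter
puts counter.class # prints "Proc"
puts counter.call  # prints 1
puts counter.call  # prints 2, the lambda remembered `count`
```

As with t in the Smalltalk snippet, counter is an ordinary object: it can be stored, passed around and run whenever you want.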

Let’s see some more code. A message without any arguments:

10 printString

a message with one argument:

10 printStringBase: 2

and a message with two arguments:

10 printStringBase: 2 nDigits: 10

Isn’t it cute? That method is called printStringBase:nDigits:. I’ve never seen that syntax anywhere else; well, except in Objective-C, which copied it from Smalltalk.

Enough toying around, let’s start building serious stuff. Let’s create a class:

Object subclass: #MyClass
       instanceVariableNames: ''
       classVariableNames: ''
       poolDictionaries: ''
       category: 'Pupeno'

Notice that a class is created by sending a message to another class telling it to subclass itself with the name and a few other arguments. It’s a message, a method call like any other. Object is a class, classes are objects. The object model of Smalltalk is a beauty but that’s a subject for another post.

Now that we have a class, let’s create a method called greet: in that class.

greet: name
  "Greets the user named name"

  | message |

  message := 'Hello ', name.
  Transcript show: message.

In that method definition first we have a comment for the method, then the list of local variables within pipes (“|”), and then the implementation, which sets the variable message to contain “Hello ” and the comma concatenates name to it. Then we just send it to the transcript.

It looks like this:

MyClass greet method

Ok, let’s use it:

m := MyClass new.
m greet: 'Pupeno'

To create an object of class MyClass, we send the new message to that class. There’s no new keyword like in Java. new is just a method. You can read its code, override it, etc. Don’t mess with it unless you really know what you are doing.

Actually, if you think about it, we haven’t seen a single keyword. Look at all the code we wrote without having to memorize any keywords! What’s even more important is that by now you essentially know Smalltalk. That’s all there is, but like LEGO bricks, these simple and small building blocks allow you to build whatever you want.

Yes, that’s it, that’s all there is to it. We already saw that Smalltalk doesn’t need loops, it has integers and that class implements the timesRepeat: message which allows you to do something N times. There are many other looping methods here and there.

What about the if keyword you ask? Surely Smalltalk has an if? Well, no, it doesn’t. What you can recognize as an if is actually implemented in Smalltalk using the same mechanism of classes and message passing you saw already. Just for fun let’s re-implement it.

We start by creating the class PBoolean and then two classes inheriting from it, PTrue and PFalse.

Object subclass: #PBoolean
       instanceVariableNames: ''
       classVariableNames: ''
       poolDictionaries: ''
       category: 'Pupeno'

PBoolean subclass: #PTrue
       instanceVariableNames: ''
       classVariableNames: ''
       poolDictionaries: ''
       category: 'Pupeno'

PBoolean subclass: #PFalse
       instanceVariableNames: ''
       classVariableNames: ''
       poolDictionaries: ''
       category: 'Pupeno'

For the class we created before, MyClass, we define an equals: method that will return either true or false, or rather, PTrue or PFalse.

equals: other
  ^ PTrue new

The little hat, ^, means return. For now, just a hardcoded true. Now we can do this in the workspace:

m1 := MyClass new.
m2 := MyClass new.
m1 equals: m2

and get true, that is PTrue, as a result. We are getting close but no if yet. What should if look like? Something like this:

m1 := MyClass new.
m2 := MyClass new.
(m1 equals: m2) ifTrue: [
  Transcript show: 'They are equal'; cr
] else: [
  Transcript show: 'They are different'; cr
]

and you can start to imagine how to implement it. In PTrue we add the method:

ifTrue: do else: notdo
  ^ do value

That method basically takes two parameters, evaluates the first one and ignores the second one. For PFalse we create the opposite:

ifTrue: notdo else: do
  ^ do value

and that’s it. A working if! If you ask me, I think this is truly amazing. And if you check Squeak itself, you’ll find the if is actually implemented this way:

True's ifTrue:ifFalse:

If your programming language allows you to create something as basic as the if conditional, then it allows you to create anything you want.
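The trick is not tied to Smalltalk’s syntax. As a rough sketch, here is the same design in Ruby, with lambdas standing in for Smalltalk blocks. The method name if_true and the otherwise: keyword are invented for this example; nothing here is Squeak’s or Ruby’s own API.

```ruby
# Hypothetical Ruby counterparts of PBoolean, PTrue and PFalse from the post.
class PBoolean; end

class PTrue < PBoolean
  # Runs the first branch and never touches the second one.
  def if_true(then_branch, otherwise:)
    then_branch.call
  end
end

class PFalse < PBoolean
  # Runs the second branch and never touches the first one.
  def if_true(then_branch, otherwise:)
    otherwise.call
  end
end

equal     = -> { "They are equal" }
different = -> { "They are different" }

puts PTrue.new.if_true(equal, otherwise: different)  # prints "They are equal"
puts PFalse.new.if_true(equal, otherwise: different) # prints "They are different"
```

Neither class contains a conditional. The decision is made by which object receives the message, which is exactly what the Smalltalk version does.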

Getting rid of RubyGems deprecation warnings

A recent update to RubyGems is causing a lot of deprecation warnings like these:

NOTE: Gem::Specification#default_executable= is deprecated with no replacement. It will be removed on or after 2011-10-01.
Gem::Specification#default_executable= called from /usr/lib/ruby/gems/1.8/specifications/rubygems-update-1.4.1.gemspec:11.
NOTE: Gem::Specification#default_executable= is deprecated with no replacement. It will be removed on or after 2011-10-01.
Gem::Specification#default_executable= called from /usr/lib/ruby/gems/1.8/specifications/bundler-1.0.7.gemspec:10.
NOTE: Gem::Specification#default_executable= is deprecated with no replacement. It will be removed on or after 2011-10-01.
Gem::Specification#default_executable= called from /usr/lib/ruby/gems/1.8/specifications/file-tail-1.0.5.gemspec:10.

I generally like software to move forward and the way to do that is to deprecate and then, after a while, make backwards incompatible changes. It’s painful but there’s no other way.

I do have a problem with all the cron jobs of my web apps like Keep on Posting or DNSk9 flooding my inbox with those warnings. Thankfully, that’s not hard to fix. Where I was doing:

rake pre_calculate_thingies > /dev/null

now I’ll be doing:

rake pre_calculate_thingies 2>&1 >/dev/null | grep -v default_executable

Git-gc'ing all git repositories

I was running out of storage space on my machine, so I started to search for things to remove using Grand Perspective. Some of the big files were inside Git repositories, or rather, inside the .git directory of those repositories. I decided it was time to run the Git garbage collector on them, all of them.

I wrote this little script:

#!/usr/bin/env bash

echo "Gitgcing $1"
cd "$1"
git gc

and with this line:

find . -name ".git" -type d -exec gitgc "{}" ";"

run in my home directory, I got all my repos gc’ed.
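The same sweep can be sketched in Ruby, for anyone who would rather not keep a helper script on the PATH. The names git_dirs and gc_all are made up for this example. The sketch assumes git is installed and, like the script above, runs git gc from inside each .git directory.

```ruby
require "find"

# Returns every directory named ".git" below root, sorted.
def git_dirs(root)
  dirs = []
  Find.find(root) do |path|
    next unless File.directory?(path) && File.basename(path) == ".git"
    dirs << path
    Find.prune # no need to look inside the .git directory itself
  end
  dirs.sort
end

# Runs `git gc` inside each .git directory found below root.
# Returns the directories where the command failed or could not be started.
def gc_all(root)
  git_dirs(root).reject do |git_dir|
    puts "Gitgcing #{git_dir}"
    system("git", "gc", chdir: git_dir)
  end
end
```

Calling gc_all(Dir.home) would cover the same ground as the find command; an empty array as the result means every repository was collected without errors.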

Careful with that email

When you are building systems like my Keep on Posting or my DNSk9 that send emails there’s always the danger that you’ll accidentally fire emails from your development machine to real users. You really don’t want to do that because it’s annoying and extremely unprofessional.

It happened to me a couple of times. Thankfully, nothing serious. But I learned the lesson. That’s why in my user models now I have a safe_email method which I use instead of accessing email whenever I’m about to actually deliver a message.

The method safe_email ensures that nobody will receive a message unless I’m in production and at the same time it’s good for testing. Obviously most of the time in development and testing mode I don’t deliver emails at all, but sometimes, I make an exception:

def safe_email
  if Rails.env.production? || email.blank? # If the email is blank (or nil), let it be.
    email
  else
    "pupeno+#{email.gsub("@", "_AT_")}@pupeno.com"
  end
end
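To make the behaviour easy to try outside Rails, here is a stand-alone sketch of the same rule. It is an adaptation, not the code from my models: the environment is passed in as a production: argument instead of being read from Rails.env, blank is checked by hand because blank? comes from ActiveSupport, and the example.com addresses are made up.

```ruby
# Stand-alone variant of safe_email for experimenting without Rails.
# `production:` replaces Rails.env.production? (an assumption of this sketch).
def safe_email(email, production:)
  # A nil, empty or whitespace-only address is returned untouched, as in the original.
  return email if production || email.nil? || email.strip.empty?

  # Everything else is redirected to a single developer mailbox,
  # keeping the original address readable in the local part.
  "pupeno+#{email.gsub("@", "_AT_")}@pupeno.com"
end

puts safe_email("jane@example.com", production: true)  # prints "jane@example.com"
puts safe_email("jane@example.com", production: false) # prints "pupeno+jane_AT_example.com@pupeno.com"
```

The rewritten address relies on the mail provider delivering pupeno+anything@pupeno.com to the pupeno mailbox, which is the usual plus-addressing behaviour but still worth checking for your own domain.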

See the status of your trackings whenever you want

In Keep on Posting you can now see the status of your trackings whenever you want (we still send the email alerts of course).

If you hover your mouse pointer over a status you’ll get a tooltip with more details:

Hiding suspended blogs and labels

If you have, manage or keep track of a ton of blogs and Twitter accounts like I do, you’ll love a couple of features I just implemented in Keep on Posting: hide suspended blogs and hide link labels.

Here’s a screenshot with everything being shown:

and here’s another with everything hidden:

Data driven tests

I’m not sure if anybody uses the terminology “data driven test” but if you explain what it is, experienced people will tell you that they are bad. Data driven tests are tests with the same code repeating over many different pieces of data.

Let’s show an example. For my startup project Keep on Posting, I have a method that turns a blog url into a feed url. That method is critical for my application and there are many things that can go wrong, so I test it by querying a sample of real blogs. The test would be something like this (this is in Ruby):

class BlogToolsTest
  BLOGS_AND_FEEDS = {
      "http://blog.sandrafernandez.eu" => "http://blog.sandrafernandez.eu/feed/",
      "http://www.lejanooriente.com" => "http://www.lejanooriente.com/feed/",
      "http://pupeno.com" => "http://pupeno.com/feed/",
      "http://www.avc.com/a_vc" => "http://feeds.feedburner.com/avc",
  }

  def test_blog_to_feed_url
    BLOGS_AND_FEEDS.each_pair do |blog_url, feed_url|
      assert_true feed_url == BlogTools.blog_to_feed(blog_url)
    end
  end
end

Note: I’m using assert_true instead of assert_equal to make a point; these kinds of tests tend to use assert_true.

The problem with that is that eventually it’ll fail and it’ll say something like:

false is not true

Oh! so useful. Let’s see at least where the error is happening… and obviously it’ll point to this line:

      assert_true feed_url == BlogTools.blog_to_feed(blog_url)

which is almost as useless as the failure message. That’s the problem with data driven tests. You might be tempted to do this and re-run the tests:

  def test_blog_to_feed_url
    BLOGS_AND_FEEDS.each_pair do |blog_url, feed_url|
      puts blog_url
      puts feed_url
      assert_true feed_url == BlogTools.blog_to_feed(blog_url)
    end
  end

but if your tests take hours to run, like the ones I often found while working at Google, then you are wasting time. Writing good error messages ahead of time helps:

  def test_blog_to_feed_url
    BLOGS_AND_FEEDS.each_pair do |blog_url, feed_url|
      assert_true feed_url == BlogTools.blog_to_feed(blog_url), "#{blog_url} should have returned the feed #{feed_url}"
    end
  end

Even so, if half your cases fail, the whole suite takes an hour to run and you have 1000 data sets, you’ll spend hours staring at your monitor, fixing one test every now and then, because as soon as one case fails, the execution of the test is halted. If you are coding in a language like Java, that’s as far as you can take it.

With Ruby you can push the boundaries and write it this way (thanks to executable class bodies):

class BlogToolsTest
  BLOGS_AND_FEEDS = {
      "http://blog.sandrafernandez.eu" => "http://blog.sandrafernandez.eu/feed/",
      "http://www.lejanooriente.com" => "http://www.lejanooriente.com/feed/",
      "http://pupeno.com" => "http://pupeno.com/feed/",
      "http://www.avc.com/a_vc" => "http://feeds.feedburner.com/avc",
  }

  BLOGS_AND_FEEDS.each_pair do |blog_url, feed_url|
    define_method "test_#{blog_url}_#{feed_url}" do
      assert_true feed_url == BlogTools.blog_to_feed(blog_url), "#{blog_url} should have returned the feed #{feed_url}"
    end
  end
end

That will generate one method per item of data. Even if one fails, the rest will still be executed, as they are separate, isolated tests. They will also be executed in a potentially random order, so you don’t end up with tests depending on other tests, and even if you don’t get a nice error message, the method name tells you which piece of data is the problematic one.

Note: blog_url and feed_url contain characters that aren’t valid in a method name written with def. define_method accepts them anyway, but replacing them gives names that are easier to read and type; I left that out to keep the example concise.
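One way to do that replacement can be sketched as follows. To keep the snippet runnable without a test framework or network access, it uses a plain class and a fake FakeBlogTools module; both are stand-ins invented for this example, as is the method_name_for helper.

```ruby
# Hypothetical stand-in for the real BlogTools, which queries live blogs.
module FakeBlogTools
  def self.blog_to_feed(blog_url)
    "#{blog_url}/feed/"
  end
end

# Collapses every run of non-alphanumeric characters into one underscore,
# so a URL becomes a readable method name.
def method_name_for(text)
  "test_" + text.gsub(/[^A-Za-z0-9]+/, "_").gsub(/\A_+|_+\z/, "")
end

class GeneratedBlogTests
  BLOGS_AND_FEEDS = {
    "http://pupeno.com" => "http://pupeno.com/feed/",
    "http://blog.sandrafernandez.eu" => "http://blog.sandrafernandez.eu/feed/",
  }

  # The class body is executable, so this loop defines one method per pair.
  BLOGS_AND_FEEDS.each_pair do |blog_url, feed_url|
    define_method(method_name_for(blog_url)) do
      actual = FakeBlogTools.blog_to_feed(blog_url)
      raise "#{blog_url} should have returned the feed #{feed_url}, got #{actual}" unless actual == feed_url
      true
    end
  end
end

puts method_name_for("http://pupeno.com") # prints "test_http_pupeno_com"
```

Two different URLs can collapse into the same name with this scheme, for example when they differ only in punctuation, so a real suite may want to append an index to each name.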

Since I’m using shoulda, my resulting code looks like this:

class BlogToolsTest
  BLOGS_AND_FEEDS = {
      "http://blog.sandrafernandez.eu" => "http://blog.sandrafernandez.eu/feed/",
      "http://www.lejanooriente.com" => "http://www.lejanooriente.com/feed/",
      "http://pupeno.com" => "http://pupeno.com/feed/",
      "http://www.avc.com/a_vc" => "http://feeds.feedburner.com/avc",
  }

  BLOGS_AND_FEEDS.each_pair do |blog_url, feed_url|
    should "turn blog #{blog_url} into feed #{feed_url}" do
      assert_equal feed_url, BlogTools.blog_to_feed(blog_url), "#{blog_url} did not resolve to the feed #{feed_url}"
    end
  end
end

and running them in RubyMine looks like this:

Rake tasks for production

When I need to run something periodically on production, I always implement it as a rake task and install it as a cron job. Nevertheless there’s some setup to do in the task to have proper logging and error reporting.

This is the template I use for creating those tasks:

namespace :projectx do
  desc "Do something"
  task :something => :environment do
    if Rails.env.development?
      # Log to stdout.
      logger = Logger.new(STDOUT)
      logger.level = Logger::INFO # DEBUG to see queries
      ActiveRecord::Base.logger = logger
      ActionMailer::Base.logger = logger
      ActionController::Base.logger = logger
    else
      logger = ActiveRecord::Base.logger
    end

    begin
      logger.info "Doing something"
    rescue Exception => e
      HoptoadNotifier.notify(e)
      raise e
    end
  end
end

While in development mode, it outputs to the console for convenience.

Another useful collection method? Enumerable#select_first

For a personal project I’m working on, I need to find out the smallest time period with at least 5 records. I essentially wrote this code:

period = [1.week, 1.month, 1.year].select_first do |period|
  Record.where("published_at >= ?", period.ago).count >= 5
end

only to find out that the select_first method doesn’t exist. So I wrote it:

module Enumerable
  def select_first(&predicate)
    self.each do |item|
      if yield(item)
        return item
      end
    end
    return nil
  end
end
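It is worth knowing that Ruby’s standard Enumerable#find (also available as detect) returns the first element for which the block is true, or nil when there is none, so it can stand in for select_first. The sketch below repeats a compact select_first so that it runs on its own and compares the two; the sample numbers are made up.

```ruby
module Enumerable
  # Compact equivalent of the select_first above, repeated so the snippet is self-contained.
  def select_first
    each { |item| return item if yield(item) }
    nil
  end
end

numbers = [1, 2, 3, 4]

puts numbers.select_first { |i| i > 2 }.inspect   # prints 3
puts numbers.find { |i| i > 2 }.inspect           # prints 3
puts numbers.select_first { |i| i > 100 }.inspect # prints nil
puts numbers.find { |i| i > 100 }.inspect         # prints nil
```

Both stop at the first match, which matters when each check is a database query like the one in the period example.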

and then of course, I tested it:

require "test_helper"

require "enumerable_extensions"

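class EnumerableTest < ActiveSupport::TestCase
  should "select_first one in the middle" do
    assert_equal 3, [1, 2, 3, 4].select_first { |i| i > 2 }
  end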

  should "select_first the first one" do
    assert_equal 1, [1, 2, 3, 4].select_first { |i| i >= 1 }
  end

  should "select_first the last one" do
    assert_equal 4, [1, 2, 3, 4].select_first { |i| i >= 4 }
  end

  should "select_first none" do
    assert_equal nil, [1, 2, 3, 4].select_first { |i| i >= 100 }
  end
end

A hash map method that returns a hash

I’ve just released another gem. This one extends Hash with another method called hmap. It solves a problem I face often: how to run a map on a hash that returns another hash, for example:

{:a => 1, :b => 2, :c => 3}

being converted into

{:a => 2, :b => 3, :c => 4}

With hmap it’s easy:

hash.hmap { |a,b| {a => b + 1} }

It also works with arrays, but you must make sure the array you return always contains two and only two elements:

hash.hmap { |a,b| [a, b + 1] }

And that’s all, quite a simple piece of code, but now it’s re-usable and well tested.
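To give an idea of what such a method involves, here is a minimal sketch written as a stand-alone function. It is not the gem’s actual code: hmap_sketch is a name invented for this example, and the error handling is only one possible choice.

```ruby
# Hypothetical re-implementation for illustration; not the code shipped in the gem.
# The block receives each key and value and must return either a Hash
# or a two-element [key, value] Array.
def hmap_sketch(hash)
  hash.each_with_object({}) do |(key, value), result|
    mapped = yield(key, value)
    case mapped
    when Hash
      result.merge!(mapped)
    when Array
      raise ArgumentError, "expected [key, value], got #{mapped.inspect}" unless mapped.size == 2
      result[mapped[0]] = mapped[1]
    else
      raise ArgumentError, "expected a Hash or an Array, got #{mapped.inspect}"
    end
  end
end

hash = {:a => 1, :b => 2, :c => 3}
p hmap_sketch(hash) { |a, b| {a => b + 1} }
p hmap_sketch(hash) { |a, b| [a, b + 1] }
```

Both calls build the incremented hash from the example above. For the common case where only the values change, Ruby 2.4 and later also offer Hash#transform_values.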
