Tag: data driven tests

Data driven tests

I’m not sure if anybody uses the terminology “data driven test” but if you explain what it is, experienced people will tel you that they are bad. Data driven tests are tests with the same code repeating over many different pieces of data.

Let’s show an example. For my startup project Keep on Posting, I have a method that turns a blog url into a feed url. That method is critical for my application and there are many things that can go wrong, so I test it by querying a sample of real blogs. The test would be something like this (this is in Ruby):

class BlogToolsTest
  BLOGS_AND_FEES =>
      "http://blog.sandrafernandez.eu" => "http://blog.sandrafernandez.eu/feed/",
      "http://www.lejanooriente.com" => "http://www.lejanooriente.com/feed/",
      "http://pupeno.com" => "https://pupeno.com/feed/",
      "http://www.avc.com/a_vc" => "http://feeds.feedburner.com/avc",
  }

  def test_blog_to_feed_url
    BLOGS_AND_FEEDS.each_pair do |blog_url, feed_url|
      assert_true feed_url == BlogTools.blog_to_feed(blog_url)
    end
  end
end

Note: I’m using assert_true instead of assert_equal to make a point; these kind of tests tend to user assert_true.

The problem with that is that eventually it’ll fail and it’ll say something like:

false is not true

Oh! so useful. Let’s see at least where the error is happening… and obviously it’ll point to this line:

      assert_true feed_url == BlogTools.blog_to_feed(blog_url)

which is almost as useless as the failure message. That’s the problem with data drive tests. You might be tempted to do this an re-run the tests:

  def test_blog_to_feed_url
    BLOGS_AND_FEEDS.each_pair do |blog_url, feed_url|
      puts blog_url
      puts feed_url
      assert_true feed_url == BlogTools.blog_to_feed(blog_url)
    end
  end

but if your tests take hours to run, like the ones I often found while working at Google, then you are wasting time. Writing good error messages ahead of time help:

  def test_blog_to_feed_url
    BLOGS_AND_FEEDS.each_pair do |blog_url, feed_url|
      assert_true feed_url == BlogTools.blog_to_feed(blog_url), "#{blog_url} should have returned the feed #{feed_url}"
    end
  end

and if half your cases fail and the whole suit takes an hour to run and you have 1000 data sets you’ll spend hours staring at your monitor fixing one test every now and then, because as soon as one case fails, the execution of the tests is halted. If you are coding in a language like Java, that’s as far as you can take it.

With Ruby you can push the boundaries and write it this way (thanks to executable class bodies):

class BlogToolsTest
  BLOGS_AND_FEES =>
      "http://blog.sandrafernandez.eu" => "http://blog.sandrafernandez.eu/feed/",
      "http://www.lejanooriente.com" => "http://www.lejanooriente.com/feed/",
      "http://pupeno.com" => "https://pupeno.com/feed/",
      "http://www.avc.com/a_vc" => "http://feeds.feedburner.com/avc",
  }

  BLOGS_AND_FEEDS.each_pair do |blog_url, feed_url|
    define_method "test_#{blog_url}_#{feed_url}" do
      assert_true feed_url == BlogTools.blog_to_feed(blog_url), "#{blog_url} should have returned the feed #{feed_url}"
    end
  end
end

That will generate one method per item of data, even if one fails, the rest will be executed as they are separate isolated tests. They will also be executed in a potential random order so you don’t have tests depending on tests and even if you don’t get a nice error message, you’ll know which piece of data is the problematic through the method name.

Note: that actually doesn’t work because blog_url and feed_url have characters that are not valid method names, they should be replaced, but I wanted to keep the example concise.

Since I’m using shoulda, my resulting code looks like this:

class BlogToolsTest
  BLOGS_AND_FEES =>
      "http://blog.sandrafernandez.eu" => "http://blog.sandrafernandez.eu/feed/",
      "http://www.lejanooriente.com" => "http://www.lejanooriente.com/feed/",
      "http://pupeno.com" => "https://pupeno.com/feed/",
      "http://www.avc.com/a_vc" => "http://feeds.feedburner.com/avc",
  }

  BLOGS_AND_FEEDS.each_pair do |blog_url, feed_url|
    should "turn blog #{blog_url} into feed #{feed_url}" do
      assert_equal feed_url, BlogTools.blog_to_feed(blog_url), "#{blog_url} did not resolve to the feed #{feed_url}"
    end
  end
end

and running them in RubyMine looks like this: