The iPad’s lack of Flash is a win-win situation
It seems the iPad’s lack of Flash support is the debate of the moment. In one camp: “You can’t use the web without Flash”; in the other: “We don’t need no stinking Flash”. Although I do realize that a technology like Flash is sometimes needed, I’m more in the second camp. The less Flash, the better.
I think the iPad’s lack of Flash will cause two things to happen:
- Slow down the adoption of the iPad: surely someone will say “No Flash, No iPad”.
- Speed up the adoption of HTML5: surely someone will consider using HTML5 to support the tablet.
Given that the iPad is a closed device, probably the most closed non-phone computer consumers have ever had access to, and that HTML5 is good progress for the web, I consider both results of the iPad not having Flash positive. If I have to say anything about it, it’d be: please, stop trying to wake Steve Jobs up regarding this, you’ll ruin it.
The sad truth about testing web applications
There are many ways to test a web application. At the lowest level, we have unit tests; at the highest level we have HTTP tests, those that use the HTTP protocol to talk to a running instance of your application (maybe running it on demand, maybe expecting it to be running on a testing server).
There are several ways to write HTTP tests. Two big families: with and without a web browser. Selenium is a popular way to write tests with a browser. A competing product is WebDriver, which I understand can use a browser or other methods. If you’ve never seen Selenium before, it’s pretty impressive. You write a test that says something like:
- go to http://…
- click here
- click there
- fill field
- fill field
- submit form
- assert response
and when you run it you actually see a Firefox window pop up and perform that sequence amazingly fast. Well, it’s amazingly fast for the first three runs, while you still have two tests or fewer. After that it’s amazingly slow, tedious, flaky and intrusive.
For the other family of tests, without a web browser, aside from WebDriver we have HttpUnit, HtmlUnit and most of the Ruby on Rails testing frameworks. The headless solutions tend to be faster and more solid, but the scenarios are not as realistic (only one JavaScript engine, if you are lucky; no rendering issues such as slowdowns; etc.).
When you are testing, as soon as you touch the HTTP protocol everything becomes much harder and less useful. If you want to be totally confident a web application is working you need to test at the HTTP level, but the return on investment for those tests is very low: they are hard to write and not very useful.
Hard to write
They are hard to write because you are not calling methods with well-defined interfaces (lists of arguments) but essentially calling a single method, an HTTP request, and passing different parameters to get different results. You don’t have any code completion, and you don’t have any formal way to know which arguments to pass. Anything can be valid.
In a unit test you may have something like:
add_user("john");
while in an HTTP test you’ll have something like
http.send_request("/user/create", "username=john");
When you are writing a unit test, figuring out the name of the add_user function and its arguments is easy. Some IDEs will autocomplete the name and show you the argument list. And if the name of add_user changes, some refactoring tools will even fix your tests for you.
But “/user/create” and “username=john” are strings. To figure them out you’ll have to know how your application handles routing, and how the parameters are passed and parsed. If your application changes from “/user/create” to “/user/add” the test will just break, most likely with a not-very-useful error message. Which takes us to the next issue…
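Here is a minimal sketch of that contrast in Python, using only the standard library. The handler, the in-memory user list and the send_request helper are all hypothetical stand-ins for the add_user and http.send_request calls above, not any particular framework’s API.

```python
import threading
import urllib.error
import urllib.parse
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

users = []

def add_user(name):
    """Application code: the name and argument list are visible to an IDE."""
    users.append(name)
    return name

class AppHandler(BaseHTTPRequestHandler):
    """Hypothetical routing layer: paths and parameters are plain strings."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        form = urllib.parse.parse_qs(self.rfile.read(length).decode())
        if self.path == "/user/create" and "username" in form:
            status, body = 200, add_user(form["username"][0]).encode()
        else:
            status, body = 404, b"not found"
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging

def send_request(port, path, params):
    """Stand-in for the http.send_request call in the text."""
    request = urllib.request.Request(
        f"http://127.0.0.1:{port}{path}", data=params.encode())
    try:
        with urllib.request.urlopen(request, timeout=5) as response:
            return response.status, response.read().decode()
    except urllib.error.HTTPError as error:
        return error.code, error.read().decode()

# Start the application on a free local port for the duration of the test.
server = HTTPServer(("127.0.0.1", 0), AppHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

# Unit level: a renamed function fails loudly with a NameError at the call.
unit_result = add_user("john")

# HTTP level: the route and the parameter are just strings.
http_result = send_request(port, "/user/create", "username=john")

# If the test uses a route the application does not know, all it gets back
# is a generic "not found".
renamed_result = send_request(port, "/user/add", "username=john")

server.shutdown()
server.server_close()
```

Nothing in the last call tells you that the route was renamed; the test only learns that the server answered with a 404.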
They are not very useful
They are not very useful because their failures are cryptic. When you write a test that calls method blah, which calls method bleh, which calls method blih, and then bloh and bluh and bluh divides by zero, you get an exception and a stack trace. Something like:
bluh:123: Division by zero! I can't divide by zero (I'm not Haskell)
bloh:234: bluh(...)
blih:452: bloh(...)
bleh:34: blih(...)
blah:94: bleh(...)
blah_test:754: blah(...)
You know that the test blah_test failed on line 754 when calling blah, which called bleh on line 94, which called blih on line 34, which called bloh on line 452, which called bluh on line 234, which divided by zero on line 123. You jump to bluh, line 123, and you may find something like:
a = i / 0;
where you replace the zero with something else; or most likely:
a = i / j;
where you have to track where j came from. Either it was calculated there or generated from another method and passed as an argument. The stack-trace gives you all the information you need to find where j was generated or where it came from. That’s a very useful test.
When you have HTTP in the middle, tests become much less useful. The stack trace of a failure would look something like:
http_request:123: Time out, server didn't respond.
blah_test:45: http_request(...)
That means that blah_test failed on line 45 making an HTTP request which failed with a timeout. Did your application divide by 0 and crash? Did it try to calculate pi and it’s still doing it? Did it fail to connect to the database? Where did it actually fail? You don’t know. The only thing you know is that something went wrong. Time to open the log files and figure it out.
You open the log file and you find there’s not enough information there. You make the application log much, much more. So much that you’ll fill a terabyte in an hour. You run the test again and this time it just passes, no errors.
When you are at the HTTP level there are many, many things that are flaky and can go wrong. Let’s invent one example here: the web server you were using for the tests wants to DNS-resolve everything it can. Every host name is resolved to an IP, and every IP is reverse-resolved to a name. When you ran the test there was a glitch and your name servers were down. Now they are working correctly and they won’t fail again for another year. Good luck figuring it out from a time-out message.
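To make the difference between the two failure reports concrete, here is a small Python sketch using only the standard library. The function names echo the invented blah/bleh/bluh chain, and the handler is hypothetical. To keep the example fast, the server answers with a generic 500 instead of timing out, but the effect on the test is the same: the details stay on the server.

```python
import threading
import traceback
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

def bluh(i, j):
    return i / j  # raises ZeroDivisionError when j is 0

def bleh(i):
    return bluh(i, 0)

def blah(i):
    return bleh(i)

class AppHandler(BaseHTTPRequestHandler):
    """Hypothetical application endpoint that hides its own failures."""

    def do_GET(self):
        try:
            status, body = 200, str(blah(1)).encode()
        except Exception:
            # The stack trace is available here, on the server side only.
            status, body = 500, b"Internal Server Error"
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging

def direct_failure():
    """What a unit test sees: the full chain of calls."""
    try:
        blah(1)
    except ZeroDivisionError:
        return traceback.format_exc()
    return ""

def http_failure(port):
    """What an HTTP test sees: a status code and a generic message."""
    try:
        urllib.request.urlopen(f"http://127.0.0.1:{port}/blah", timeout=5)
    except urllib.error.HTTPError as error:
        return f"{error.code} {error.read().decode()}"
    return ""

server = HTTPServer(("127.0.0.1", 0), AppHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

direct_report = direct_failure()
http_report = http_failure(server.server_address[1])

server.shutdown()
server.server_close()
```

The direct report names bluh and the division by zero; the HTTP report contains neither, which is why the next step is always the log files.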
The other way in which HTTP tests fail is something like this:
blah_test:74: Index out of bound for this array
You go to line 74 and it’s something like:
assert_equal("username", data[0]);
If data[0] caused an out-of-bound error, then the array data is empty. How can it be empty? It contains the response from the server and you know the server is responding with something usable because you are using the app right now.
What happened was that the log in box used to have the id, in HTML, "login" and it is now "log-in". That means the HTML parsing methods in blah_test don’t find the log in box and fail to properly fill the array data. Yet another case of tests exposing bugs, in the tests. And the real-life failures are much, much more complex than this.
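A rough Python sketch of how that failure comes about, using the standard library HTML parser. The FieldFinder class, the find_fields helper and the two HTML snippets are invented for illustration; only the "login" and "log-in" ids come from the story above.

```python
from html.parser import HTMLParser

class FieldFinder(HTMLParser):
    """Collects the names of <input> fields inside the element with box_id."""

    def __init__(self, box_id):
        super().__init__()
        self.box_id = box_id
        self.depth = 0  # nesting depth inside the target element, 0 = outside
        self.fields = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input":
            # <input> has no closing tag, so it never changes the depth.
            if self.depth > 0 and "name" in attrs:
                self.fields.append(attrs["name"])
            return
        if self.depth > 0:
            self.depth += 1
        elif attrs.get("id") == self.box_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth > 0 and tag != "input":
            self.depth -= 1

def find_fields(html, box_id):
    finder = FieldFinder(box_id)
    finder.feed(html)
    return finder.fields

old_page = '<div id="login"><input name="username"><input name="password"></div>'
new_page = '<div id="log-in"><input name="username"><input name="password"></div>'

data_before = find_fields(old_page, "login")  # the test was written against this
data_after = find_fields(new_page, "login")   # the id changed, nothing is found

try:
    data_after[0]
    failure = ""
except IndexError as error:
    failure = f"{type(error).__name__}: {error}"
```

One possible mitigation is to assert that the box was found at all before indexing into data, so the failure message at least mentions the missing id rather than an array index.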
My recommendation
All this makes the return on investment of writing HTTP tests quite low. They are very hard to write and they provide very little information when they fail. They do provide good information when they pass: if it works at the HTTP level, probably everything else works too.
I’d recommend any project not to write any HTTP test unless every other possible test, unit and integration, is already written.
Redirecting on load
Of all the bad practices I see on the web this ranks as very bad and I believe it’s not mentioned enough. It’ll easily make it to my personal top 5.
I go to a web site, like example.com, and I immediately get redirected to an ugly URL beast, like example.com/news/today?date=2009-06-30&GUID=5584839592719193765662. Wha? Why? First, the site broke any chance I had of making a bookmark of it with just one click. I don’t want to bookmark yesterday’s news (look at the URL, it has a date), and what’s that GUID? Oh well, I go and make the bookmark, pointing to example.com, by hand, because I have no other way.
Even if it only redirected me to example.com/news/today it’d be pretty bad. That URL may not work tomorrow due to changing software. Or what can be even worse: the software and the content get revamped, the URLs changed and everything is cool again, and since the developers are smart people they leave old URLs working. So my bookmark works, but shows obsolete information.
With my crazy browsing habits (open a trillion tabs, fast, fast, faster) I go to a page, leave it loading, and when I go back and see a weird URL I end up wondering whether I accidentally clicked on something or something weird happened. I have to go back and check.
It gets even worse when the URL is rather obscure. My e-banking site has this issue. I go to the bank home page where I can find the e-banking link. I click it and it opens the e-banking page, which sells you the service and in a small corner has a link to the real e-banking application where you can log in and see the big red numbers. I’d say they have a deeper problem than redirecting. They see the bank as a company with its useless propaganda home page and e-banking as a product with its useless propaganda home page and then, the actual e-banking site, somewhere else. They should just have the log in on their home page, like any other on-line service. But I digress.
Back to redirecting. I click log in and it opens, in another window, a web site with a URL that is measured in meters. Long, ugly and scary. I never even thought of bookmarking that because I’m sure it won’t work the second time. So my bookmark is to the previous page. Just today, after a year of using it, I discovered that there’s a nice, short, well-formed URL for the log in page, something like bank.com/ebanking/login, which immediately redirects to the ugly one. Thanks to the amazing speeds of Switzerland’s internet connections and today’s browsers, I never noticed.
If the bank had just been serving the content through that URL, they would have saved more time over a year than it took me to write this post. Literally. I can’t understand why they don’t do it properly. If they are passing session information, they should use session state on the server side and a cookie. If they have a modular structure where the app is located elsewhere, instead of redirecting you they should use a reverse proxy. It takes a day to configure Apache for such a thing if you don’t know what you are doing.
I’ve been using it for ages to serve Plone sites that live in a subdirectory of a Zope web server running on an alternate port, yet the front end is Apache and you are never redirected anywhere. You go to example.com, which hits my Apache server; Apache internally makes a request to zope.example.com:8080/example.com and serves you the result, so you never leave example.com. Even if you go to the secure version, the SSL part is handled by Apache since Zope is not that good (or wasn’t) at it.
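The idea behind a reverse proxy can be sketched in a few lines of Python with the standard library. This is only a toy to show the request flow, not a substitute for Apache: both servers, the handlers and the /example.com prefix on the back end are assumptions made up for the example.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class QuietHandler(BaseHTTPRequestHandler):
    def log_message(self, *args):
        pass  # silence per-request logging

    def reply(self, status, body):
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

class BackendHandler(QuietHandler):
    """Plays the part of the application server on its alternate port."""

    def do_GET(self):
        self.reply(200, f"backend saw {self.path}".encode())

class FrontHandler(QuietHandler):
    """Plays the part of the front end: fetches from the back end itself."""

    backend = ""  # base URL of the back end, filled in below

    def do_GET(self):
        with urllib.request.urlopen(self.backend + self.path, timeout=5) as upstream:
            self.reply(upstream.status, upstream.read())

def start(handler):
    server = HTTPServer(("127.0.0.1", 0), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

backend = start(BackendHandler)
FrontHandler.backend = f"http://127.0.0.1:{backend.server_address[1]}/example.com"
front = start(FrontHandler)

# The visitor only ever talks to the front end.
front_url = f"http://127.0.0.1:{front.server_address[1]}/news"
with urllib.request.urlopen(front_url, timeout=5) as response:
    final_url = response.geturl()
    page = response.read().decode()

for server in (front, backend):
    server.shutdown()
    server.server_close()
```

The visitor gets the back end’s content while the URL they see stays the one they asked for; no redirect is ever sent to the browser.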
There are legitimate cases for redirecting someone on a web site: when the content is no longer available or is temporarily unavailable, or when the user has just submitted a form and, if it was processed successfully, you redirect to another page that shows the result (the record created or whatever). There are many reasons to do that, but that’s for another post.
There’s no reason to redirect on load. Please, don’t do it.
Reviewed by Daniel Magliola. Thank you! Use Other Door picture by cobalt123.
Proper linking etiquette
This has been mentioned thousands of times on the interwebs, but in case there’s at least one person reading this that didn’t know it, I’m explaining it again. Using hyperlinks in a piece of text doesn’t mean it has to stop being proper, readable English (or any other language). For example, imagine the phrase:
It was a nice movie, click here to read more about it.
Read it again. Now close your eyes and imagine someone reading it out loud. It doesn’t make any sense, does it?
Hyperlinks already carry the meaning that there’s more information behind them. No need to repeat it with “to read more about it”. They also carry the information that they can be clicked, so no need to say “click here”. And in some interfaces you don’t click; I can already think of two cases:
- People using the keyboard and only the keyboard to navigate. There are more of them than you think. I myself would be doing it much more if it weren’t so hard on so many broken web sites.
- People using a phone, like the iPhone. You don’t click, nothing clicks. It’s called tapping.
For computers, “click here” doesn’t provide any proper metadata. There are services that extract a lot of information about links, Google being one example. Let’s analyze what would happen with Google if you do it correctly, like:
It was a nice movie.
That was short, wasn’t it? Half the size and no-nonsense, but I digress. Google would index that link as a “nice movie” and that’s good because you are adding information to the web, you are expressing your opinion and when people search for “nice movie” they are more likely to find the movie you pointed to. Maybe you are the only one believing that’s a nice movie, but when lots of people link to it as a “nice movie”, Google will catch that.
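As a rough illustration of what a machine gets out of each version, here is a small Python sketch that pulls the link text out of both sentences with the standard library HTML parser. The markup and the example.com/movie URL are made up for the example, and real search engines obviously do far more than this.

```python
from html.parser import HTMLParser

class LinkText(HTMLParser):
    """Collects (href, link text) pairs, roughly what a crawler could index."""

    def __init__(self):
        super().__init__()
        self.href = None  # href of the link being read, None when outside a link
        self.text = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.href = dict(attrs).get("href")
            self.text = []

    def handle_data(self, data):
        if self.href is not None:
            self.text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.href is not None:
            self.links.append((self.href, "".join(self.text).strip()))
            self.href = None

def extract_links(html):
    parser = LinkText()
    parser.feed(html)
    return parser.links

click_here = ('<p>It was a nice movie, <a href="http://example.com/movie">'
              'click here</a> to read more about it.</p>')
descriptive = '<p>It was a <a href="http://example.com/movie">nice movie</a>.</p>'

click_here_links = extract_links(click_here)
descriptive_links = extract_links(descriptive)
```

The first version associates the movie’s URL with the words “click here”; the second associates it with “nice movie”.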
Also, imagine that your page gets turned into plain text, or printed, or spoken, or whatever:
- It was a nice movie, click here to read more about it.
- It was a nice movie.
Which one makes more sense?
Now, we can take it a step further. Something else you can do to make your text more readable, more robust and nicer overall is to do more or less proper attribution. I’m not talking about academic proper attribution, I’m talking about simple things. I recently found this sentence in the Stack Overflow article Advice for Computer Science College Students:
I’ve read an article from Joelonsoftware.com a few years agohttp://www.joelonsoftware.com/articles/CollegeAdvice.html
which I promptly edited, thanks to my karma earnings, to be:
I’ve read the article Advice for Computer Science College Students from Joel on Software a few years ago.
Aside from the proper period at the end of a sentence, do you see how and why my version is more readable, contains much more information (while being shorter on text on the screen) and can resist being turned into text, speech or braille? So, next time you write something, please, remember that even if you are using a computer, you are still writing a proper language.
Sometimes the links are so important that you want them to get to a text or spoken version. In that case, imagine how you would write it if you were speaking or writing with a pen on paper:
I really like Joel on Software, which you can read on http://joelonsoftware.com.
which you can then later enhance for the web:
I really like Joel on Software, which you can read on http://joelonsoftware.com.
Now there’s extra information in there. The URL is there three times: once in text, twice in hyperlinks. But the text is not longer and it’s not harder to read (unless you pick hyperlink colors badly), and it gives the user more places to follow the link and gives machines that look for context information more to pick up on. It’s a win-win.
Reviewed by Daniel Magliola. Thank you!
Pylons or Django?
I am trying to decide whether to use Pylons or Django. Both are frameworks for building Python web applications, but with opposing philosophies.
Django tries to be everything. It comes with its own ORM, its own template engine, its own everything. That gives you a nice developing experience because everything fits together and because very nice applications can be built on top of all those components, like the admin tool, which is amazing.
