A mass of stupid benchmarks. You have been warned.

* * * * *

A mass of stupid benchmarks. You have been warned.

A few weeks ago Wlofie [1] and I were talking about computer languages and I
mentioned the language I wrote in college [2] (it was more than the simple
DSL (Domain Specific Language) shown there), so I cleaned up the code so it
would compile (most of it was written in 1993; last date of modification was
March 5^th, 1995) and just because I was curious, I decided to run a stupid
benchmark on it.

What's more stupid than just adding up the first 1,000,000 integers?

> : main
> 0 0
> begin
> swap over + swap 1 +
> dup 1000000 < while
> repeat
> drop "%d#n" string print ;
>
> main 0 exit
>

(Oh, did I mention the language was based off Forth [3]?)

Running that code on my modern 2.6GHz (Gigahertz) machine takes a full 2.1
seconds (and that's after I removed a huge bottle neck in the runtime
engine). The equivalent C code:

> #include <stdio.h>
> #include <stdlib.h>
>
> int main(int argc,char *argv[])
> {
> int i;
> int total;
>
> for (i = 0 , total = 0 ; i < 1000000 ; i++)
> total += i;
>
> printf("%d\n",total);
> return EXIT_SUCCESS;
> }
>
>

takes a mere 0.004 seconds to run.

Okay, that's expected—it's C. But what about Perl? 0.47 seconds.

Heh. A hundred times slower.

Hey! What about Python? It's not as fast as Perl, but it comes in at around
0.58 seconds.

I'm not even going to attempt a comparrison between my language and Ruby [4]—
it'd be too depressing to lose out to a known slow language.

But Wlofie did mention one other language he was playing around with—Lua [5].
Okay, how fast is Lua? Same benchmark as above: 0.15 seconds.

Wow.

It blew Perl right out of the water.

This required a bit more investigation. And a better test than just summing
up the first 1,000,000 integers. I settled upon the Jumble program [6]. It
involves I/O (Input/Output) and has one non-trivial operation (sorting the
letters in a word). And on my current system, the C version can check 483,523
words in 0.18 seconds (the timings I gave back in 2008 were for my older,
120MHz (Megahertz) development machine).

The Lua code is straightforward:

> #!/usr/local/bin/lua
>
> WORDS = "/usr/share/dict/words"
>
> -- ******************************************************
>
> function sortword(word)
> local t = {}
> for i=1,string.len(word) do t[i] = string.byte(word,i) end
> table.sort(t)
> return string.char(unpack(t))
> end
>
> -- ********************************************************
>
> if #arg < 1 then
> io.stderr:write("usage: %s word\n",arg[0])
> os.exit(1)
> end
>
> word = sortword(string.upper(arg[1]))
>
> dict = io.open(WORDS,"r")
> if dict == nil then
> io.stderr:write("Houston, we have a problem ... \n")
> os.exit(1)
> end
>
> for line in dict:lines() do
> dictsort = sortword(string.upper(line))
> if dictsort == word then
> print(line)
> end
> end
>
> dict:close()
> os.exit(0)
>
>

sortword() has quite a bit of work to do—it has to go through the string,
placing each letter into an array, sort the array, and then convert the array
back into a string. Unlike C, which treats strings as an array of characters
(and thus can be sorted without the data gymnastics), Lua (and pretty much
all other modern scripting languages) treat strings at a single unit.

So it takes 7.9 seconds for Lua to run through the same 483,523 words.

Ouch. And I think it's pretty obvious where the time is being spent—in
sortword(), pulling apart strings, sorting arrays, making new strings. But
Lua can be easily extended using C. I rewrite sortword() in C:

> #include <stdlib.h>
> #include <string.h>
>
> #include <lua.h>
> #include <lauxlib.h>
> #include <lualib.h>
>
> /*******************************************/
>
> static int cmp(const void *l,const void *r)
> {
> int cl;
> int cr;
>
> cl = *(const char *)l;
> cr = *(const char *)r;
>
> return cl - cr;
> }
>
> /************************************/
>
> static int sortword(lua_State *L)
> {
> const char *word;
> size_t len;
>
> word = luaL_checkstring(L,1);
> len = strlen(word);
>
> char buffer[len];
> memcpy(buffer,word,len);
> buffer[len] = '\0';
> qsort(buffer,len,1,cmp);
> lua_pushlstring(L,buffer,len);
> return 1;
> }
>
> /***********************************/
>
> int luaopen_sortword(lua_State *L)
> {
> lua_register(L,"sortword",sortword);
> return 0;
> }
>

And now it only takes 2.2 seconds. Not a bad return upon investment.

Now we turn to Perl:

> #!/usr/bin/perl
> use strict;
>
> #***************************
>
> sub sortword
> {
> my $str = shift;
> my @t = unpack("(C)*",$str);
>
> @t = sort(@t);
> return pack("(C)*",@t);
> }
>
> #*******************************
>
> if (scalar(@ARGV) == 0)
> {
> printf(STDERR "usage: %s word\n",$0);
> exit(1);
> }
>
> my $word = sortword(uc($ARGV[0]));
>
> open(DICT,"/usr/share/dict/words")
> or die("Houston, we have a problem ... \n");
>
> while(my $line = <DICT>)
> {
> chop $line;
>
> my $dictsort = sortword(uc($line));
>
> if ($dictsort eq $word)
> {
> print $line . "\n";
> }
> }
>
> close(DICT);
> exit(0);
>

Again, we need to pull apart the string into an array, sort that, and
recreate the string. And when we run this lovely bit of Perl code it only
takes 10.5 seconds.

Wow.

And here I thought Perl was the text crunching Master of the Universe.

Guess not.

Well, the Perl version of sortword() could be tightened up a bit …

> sub sortword
> {
> return pack("(C)*",sort(unpack("(C)*",shift)));
> }
>

That takes us down to 8.6 seconds—nearly two full seconds to shuffle data
between variables. Perl isn't looking all that good. Well, there is the
function overhead, and given that we got sortword() down to one line, maybe
if we inline it we'll get an improvement.

And we do—it's not 7.2 seconds. We finally beat unoptimized Lua code, but now
with a more cryptic Perl version.

Way to go, Perl!

But given that Lua was easy to extend with C, can we do the same with Perl? A
bit of searching lead me to the Inline::C Perl Module [7]. Ten minutes to
download and install (much better than I expected) and if anything, it was
easier to embed C in Perl than in Lua:

> #!/usr/bin/perl
> use Inline C;
> use strict;
>
> # same code as above, minus the sortword() subroutine
>
> exit(0);
>
> __END__
> __C__
>
> static int cmp(const void *l,const void *r)
> {
> int cl;
> int cr;
>
> cl = *(const char *)l;
> cr = *(const char *)r;
>
> return cl - cr;
> }
>
> char *sortword(char *word)
> {
> size_t len;
>
> len = strlen(word);
> qsort(word,len,1,cmp);
> return word;
> }
>

And we're down to 2.1 seconds. Just a bit quicker than the Lua/C version.

Well, that's after the 4.1 second initial run where it dumped a bunch of
directories and files (lovely that).

Lua appears to be a bit faster than Perl, or at the very least, as fast as
Perl.

And everything (even, quite possibly, Ruby) is faster than my own homebrewed
language … sigh.

[1] http://wlofie.dyndns.org/
[2] gopher://gopher.conman.org/0Phlog:2007/01/23.1
[3] http://forth.org/
[4] http://www.theregister.co.uk/2008/04/21/bray_ruby_rails/
[5] http://www.lua.org/
[6] gopher://gopher.conman.org/0Phlog:2008/06/27.1
[7] http://search.cpan.org/~sisyphus/Inline-0.45/C/C.pod

Email author at [email protected]