* * * * *
About those eight seconds
In my previous post [1], I mentioned that a program that took a year to run
twenty-some years ago on a then state-of-the-art workstation could now be run
in 17 minutes on a two-year-old laptop.
What I didn't mention is that the program twenty-some years ago was written
in C and the program I ran today was written in Lua [2].
Yes, computers are so fast these days that a scripting language on a current
machine can outperform C on a computer from two decades ago [3].
“Okay,” you say. “But I don't want to wait 17 minutes for my data.”
Okay, fine. I see two options. Let's try the first, the one most people would
reach for: drop down to C. And yes, that does give us an impressive
improvement: only 2.5 seconds per frame, and across four cores that means
you'll have the results in a little over five minutes.
Not that bad.
The other option (and hear me out) is to take our Lua code and run it via
LuaJIT [4], a (pretty much) drop-in replacement for Lua that compiles down to
native code. Even if it's a bit slower than C, it should still be faster than
Lua with no code changes.
So how does LuaJIT fare?
Personally, I was expecting the C version (which I actually wrote first) to
be faster, if only by a little bit, but …
So, here's the C version (which generates a single image):
> [spc]saltmine:~/source/play>time ./a.out >/dev/null
>
> real 0m2.483s
> user 0m2.470s
> sys 0m0.000s
>
And now the LuaJIT version:
> [spc]saltmine:~/source/play>time luajit amap.lua >/dev/null
>
> real 0m0.849s
> user 0m0.840s
> sys 0m0.000s
>
[Yeah, that's what I did when I saw these results] [5]
And no, that's not a mistake (believe me, I checked and rechecked). Here:
> [spc]saltmine:~/source/play>time lua amap.lua >/dev/null
>
> real 0m8.091s
> user 0m8.060s
> sys 0m0.000s
> [spc]saltmine:~/source/play>time luajit amap.lua >/dev/null
>
> real 0m0.856s
> user 0m0.850s
> sys 0m0.000s
>
So … um … I can have that data to you in two minutes? Is that fast enough?
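The wall-clock estimates follow from the per-frame timings above. Here is a minimal sketch of the arithmetic, assuming about 500 frames split evenly across four cores. The frame count is not stated in this post (it is only inferred from the quoted totals), and `total_minutes` is a hypothetical helper:

```lua
-- Assumption: ~500 frames; the post gives only per-frame times and totals.
local FRAMES = 500
local CORES  = 4   -- "across four cores"

-- Per-frame wall-clock seconds, taken from the `time` runs above.
local per_frame = {
  { name = "lua",    seconds = 8.091 },
  { name = "c",      seconds = 2.483 },
  { name = "luajit", seconds = 0.849 },
}

-- Hypothetical helper: total minutes if frames divide evenly across cores.
local function total_minutes(seconds_per_frame, frames, cores)
  return seconds_per_frame * frames / cores / 60
end

for _, run in ipairs(per_frame) do
  print(string.format("%-6s %5.1f minutes", run.name,
                      total_minutes(run.seconds, FRAMES, CORES)))
end
```

Under that assumption the three totals land close to the 17, five and two minutes quoted in the text.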
On reflection, it makes sense that LuaJIT will outperform C in this case.
The program is heavily CPU (Central Processing Unit) bound, and the main
function:
> function mainloop(A,B,C,D)
>   local pix = {}
>   local xn = 0.5
>   local yn = 0.5
>
>   for count = 1 , MAX do
>     local xn1 = ((A * yn) + B) * xn * (1.0 - xn)
>     local yn1 = ((C * xn) + D) * yn * (1.0 - yn)
>
>     xn = xn1
>     yn = yn1
>
>     if xn < 0 then return MAX-1 end
>     if xn >= 1 then return MAX-1 end
>     if yn < 0 then return MAX-1 end
>     if yn >= 1 then return MAX-1 end
>
>     local ix = math.floor(xn * DIM)
>     local iy = math.floor(yn * DIM)
>     local f = iy * DIM + ix -- Lua doesn't really do N-dimensional arrays
>
>     if pix[f] then return count-1 end
>     pix[f] = true
>   end
> end
>
can be recompiled per call to take advantage of the parameters. For instance,
when A is 0 and B is 1, the first expression then becomes:
> local xn1 = xn * (1.0 - xn)
>
and given that I'm doing 32,768 iterations of this (oh, did I fail to
mention that? Yes, I'm doing 27,768 more iterations than the code did
twenty-some years ago), this does save quite a bit of time.
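To make the idea concrete, here is a hand-written sketch of what specializing on the parameters would look like. It is an illustration only: whether LuaJIT performs this particular simplification is not confirmed here, the values chosen for `DIM`, `C` and `D` are assumptions, and `mainloop_a0_b1` is a hypothetical name. With A = 0 and B = 1 the factor `(A * yn) + B` is exactly 1, so both functions should return the same count.

```lua
local MAX = 32768  -- iteration count quoted in the text
local DIM = 256    -- assumption: the image dimension is not given in the post

-- General version, same logic as the function shown above.
local function mainloop(A, B, C, D)
  local pix = {}
  local xn, yn = 0.5, 0.5
  for count = 1, MAX do
    local xn1 = ((A * yn) + B) * xn * (1.0 - xn)
    local yn1 = ((C * xn) + D) * yn * (1.0 - yn)
    xn, yn = xn1, yn1
    if xn < 0 or xn >= 1 or yn < 0 or yn >= 1 then return MAX - 1 end
    local f = math.floor(yn * DIM) * DIM + math.floor(xn * DIM)
    if pix[f] then return count - 1 end
    pix[f] = true
  end
end

-- Hypothetical hand-specialized version for A = 0, B = 1:
-- the multiply-add in the first expression disappears entirely.
local function mainloop_a0_b1(C, D)
  local pix = {}
  local xn, yn = 0.5, 0.5
  for count = 1, MAX do
    local xn1 = xn * (1.0 - xn)
    local yn1 = ((C * xn) + D) * yn * (1.0 - yn)
    xn, yn = xn1, yn1
    if xn < 0 or xn >= 1 or yn < 0 or yn >= 1 then return MAX - 1 end
    local f = math.floor(yn * DIM) * DIM + math.floor(xn * DIM)
    if pix[f] then return count - 1 end
    pix[f] = true
  end
end

-- Assumed sample parameters (C = 0, D = 3.9); both calls should agree.
print(mainloop(0, 1, 0, 3.9), mainloop_a0_b1(0, 3.9))
```

Timing the two variants under `lua` and `luajit` is one way to see how much of the difference comes from the simpler expression and how much from native compilation in general.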
[DELETED-Update on August 5th, 2013
Oops, I made a slight mistake [6] …
-DELETED]
Update on Tuesday, August 6th, 2013
Mistakes were made alright [7].
[1]
gopher://gopher.conman.org/0Phlog:2013/08/04.1
[2]
http://www.lua.org/
[3]
http://prog21.dadgum.com/52.html
[4]
http://luajit.org/
[5]
gopher://gopher.conman.org/IPhlog:2013/08/04/jawdrop.jpg
[6]
gopher://gopher.conman.org/0Phlog:2013/08/05.1
[7]
gopher://gopher.conman.org/0Phlog:2013/08/06.1
Email author at
[email protected]