I wanted to do something just a tiny bit more serious for a change. It has
been a while. Most of the world seems to conspire to keep me on paltry
affairs...!
One approach to linear algebra is to build up from a collection of fortran77
subroutines named BLAS. The iconic higher level software is LAPACK (linear
algebra pack). But I want to try something else.
Now this might be dubious, but one thing I want to try is using cblas - a
way to use legacy BLAS from C - from Embeddable Common Lisp, a common lisp
compiler that compiles to C and hence has good C interoperability. The gist
is if you imagine
```ksh
pkg_add ecl cblas
```
and then for example if we try
```lblas.ecl
(ffi:clines "
#include <stdlib.h>
#include <stdio.h>
#include <cblas.h>
")
(defun 3-3-dgemv (alpha a x beta y
&aux (m 3) (n 3) (lda 3)) "
cblas_dgemv( .. );
currently makes a lot of assumptions (incx=1 etc
"
(ffi:c-inline (m n alpha beta lda a x y)
(:int :int :double :double :int
:pointer-void
:pointer-void
:pointer-void)
nil "
enum CBLAS_ORDER order;
enum CBLAS_TRANSPOSE transa;
cblas_dgemv( order, transa, m, n, alpha,
a, lda, x, incx, beta, y, incy );
@(return 0) = #7;
"))
(defun 3-dgemv (alpha va vx beta vy)
(ffi:with-foreign-object (a '(:array :double 9))
(ffi:with-foreign-object (x '(:array :double 3))
(ffi:with-foreign-object (y '(:array :double 3))
(loop for v across va for c from 0 to 8 do
(setf (ffi:deref-array a '(:array :double) c) v))
(loop for v across vx for c from 0 to 2 do
(setf (ffi:deref-array x '(:array :double) c) v))
(loop for v across vy for c from 0 to 2 do
(setf (ffi:deref-array y '(:array :double) c) v))
(3-3-dgemv alpha a x beta y)
(loop for c below 3 do
(setf (aref vy c)
(ffi:deref-array y '(:array :double) c)))
(values vy)))))
(c:build-program "lblas-test" :lisp-files '("lblas.o" ))
```
And then kinda like
```ksh
chmod +x build.ecl
/build.ecl
/lblas-test
```
```Output
real time : 0.577 secs
run time : 0.590 secs
gc count : 3 times
consed : 10085808 bytes
#(4.100000023841858d0 2.600000023841858d0 2.600000023841858d0)
```
which at least seems to have done the right loop* slowly in a
double precision float sort of way. I think the slowness is
overhead from ecl. So given a big enough algebra problem, I
think it could be faster than ecl on its own.
It's pretty neat. I think after finishing a thick wrapper into
cblas from ecl (-> a c lib), I will have a look at making my
own eigen3 or arpack.