* * * * *

            Managing TLS connections using Lua and Lua coroutines

Getting libtls [1] working with Lua [2] wasn't as straightforward as I
thought it would be. It works (for the most part) but I had to change my
entire approach. The code is an ugly mess and there's quite a bit of
duplication in several spots.

But! I can request web pages, in Lua [3], via HTTPS (HyperText Transfer
Protocol-Secure) in an event loop based around select() (or poll() or epoll()
or whatever low-level event notification scheme is in use). Woot! And I'm
going into excruciating detail on this.

Back on Friday, when I wrote some “proof-of-concept [4]” code, I had thought
I could switch coroutines [5] in the user-supplied I/O (Input/Output)
callback routines [6] (and if coroutines existed in C [7], that is where you
would yield to another coroutine). It was easy enough to extend the callback
to a Lua routine. This is the routine that wraps the libtls function
tls_connect_cbs():

-----[ C ]-----
static int Ltls_connect_cbs(lua_State *L)
{
 struct tls **tls = lua_touserdata(L,1);
 int rc           = tls_connect_cbs(
                       *tls,
                       Xtls_read,
                       Xtls_write,
                       L,
                       luaL_checkstring(L,2)
                    );

 if (rc != 0)
 {
   lua_pushboolean(L,false);
   return 1;
 }

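 /* stack: 1=context 2=servername 3=userdata 4=read function 5=write function */
 lua_settop(L,5);
 lua_pushlightuserdata(L,*tls);  /* registry key: the libtls context pointer */
 lua_getuservalue(L,1);          /* registry value: the context's user value */
 lua_pushvalue(L,1);
 lua_setfield(L,-2,"_ctx");
 lua_pushvalue(L,2);
 lua_setfield(L,-2,"_servername");
 lua_pushvalue(L,3);
 lua_setfield(L,-2,"_userdata");
 lua_pushvalue(L,4);
 lua_setfield(L,-2,"_readf");
 lua_pushvalue(L,5);
 lua_setfield(L,-2,"_writef");

 /* registry[*tls] = table, so the callbacks can find it later */
 lua_settable(L,LUA_REGISTRYINDEX);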
 lua_pushboolean(L,true);
 return 1;
}
-----[ END OF LINE ]-----

I pass in the two callback functions, and I'm using the Lua state context as
the userdata in the callbacks. I then create a Lua table, populate it with
some useful information, such as the Lua functions to call, and associate it
in the Lua registry with the value of the libtls context. Then, when libtls
calls one of the callbacks:

-----[ C ]-----
static ssize_t Xtls_write(struct tls *tls,void const *buf,size_t buflen,void *cb_arg)
{
 lua_State *L = cb_arg;
 ssize_t    len;

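 lua_pushlightuserdata(L,tls);
 lua_gettable(L,LUA_REGISTRYINDEX); /* table cached by Ltls_connect_cbs() */
 lua_getfield(L,-1,"_writef");      /* the Lua function to call           */
 lua_getfield(L,-2,"_ctx");         /* arg 1: the context userdata        */
 lua_pushlstring(L,buf,buflen);     /* arg 2: the data to write           */
 lua_getfield(L,-4,"_userdata");    /* arg 3: the user-supplied data      */
 lua_call(L,3,1);

 len = lua_tonumber(L,-1);
 lua_pop(L,2);                      /* drop the result and the table      */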
 return len;
}
-----[ END OF LINE ]-----

I get the Lua state via the user argument. From that, and the libtls context,
I obtain the data I cached into the Lua table, which gives me the Lua function
to call. Said function can then call coroutine.yield().

Straightforward, easy, and wrong! I got the dreaded “attempt to yield across
metamethod/C-call boundary” error. Darn.

The attempted flow looks like this (in the image [8], yellow boxes are Lua
functions and green boxes are C functions; in the text below, {} marks a Lua
function and [] a C function):

{data=tls.read()} → [Ltls_read(lua)] → [tls_read(ctx)] → [Xtls_read(ctx,lua)]
→ [lua_call(lua)] → {my_callback()} → {coroutine.yield()}

There are four layers of C functions that can't be yielded through. Lua does
have a way of dealing with intervening C functions, but it's somewhat clunky.

{luaf_a()} → [cf_orig(lua)] → [lua_callk(lua,cf_c)] → {luaf_b()} →
{coroutine.yield} / {coroutine.resume} → {luaf_b()*} → [cf_c(lua)*] →
{luaf_a()} [9]

In this case, the Lua API function lua_callk() [10] is handled specially [11]
so it doesn't cause an error. The function cf() needs to be split in half:
the portion prior to calling into Lua, and the second half to handle things
after a potential call to coroutine.yield(). That's represented above by the
functions cf_orig() and cf_c(). The “*” marks a function that is returning,
not being called. coroutine.resume() will restart luaf_b() right after its
call to coroutine.yield(). And when luaf_b() returns, it “returns” to cf_c(),
which does whatever and finally returns, which “returns” to luaf_a().

But the case I'm dealing with just doesn't work with that model. The code
calling into Lua doesn't have the signature:

-----[ C ]-----
int function(lua_State *L);
-----[ END OF LINE ]-----

but the signature:

-----[ C ]-----
ssize_t function(struct tls *ctx,void *buf,size_t buflen,void *cb_arg);
-----[ END OF LINE ]-----

Not only are the return types different, but they have completely different
semantics—for libtls, it's the number of bytes transferred, whereas for Lua,
it's the number of items being returned to Lua.

No, I had to rethink the entire approach, and do the call to
coroutine.yield() a bit higher in the call stack. Which also meant I had to
push dealing with TLS_WANT_POLLIN and TLS_WANT_POLLOUT back to the caller.
The documentation states:

> * TLS_WANT_POLLIN The underlying read file descriptor needs to be readable
>   in order to continue.
> * TLS_WANT_POLLOUT The underlying write file descriptor needs to be
>   writeable in order to continue.
>
> In the case of blocking file descriptors, the same function call should be
> repeated immediately. In the case of non-blocking file descriptors, the
> same function call should be repeated when the required condition has been
> met.
>

And here I was, trying to hide such concerns from the user. Ah well.
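
As a rough sketch of the retry rule quoted above (this is not the code from
my wrapper), here is what "repeat the same call when the condition has been
met" can look like with a non-blocking descriptor and poll(). Everything
prefixed with sketch_ or SKETCH_ is a hypothetical name made up for this
illustration: sketch_fd_read() stands in for tls_read(), and the two
constants stand in for TLS_WANT_POLLIN and TLS_WANT_POLLOUT.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-ins for TLS_WANT_POLLIN and TLS_WANT_POLLOUT */
#define SKETCH_WANT_POLLIN  (-2)
#define SKETCH_WANT_POLLOUT (-3)

/* Hypothetical: any function shaped roughly like tls_read() */
typedef ssize_t (*sketch_read_fn)(void *ctx,void *buf,size_t buflen);

/* Put a descriptor into non-blocking mode; returns 0 on success */
int sketch_nonblocking(int fd)
{
  int flags = fcntl(fd,F_GETFL);
  return (flags < 0) ? -1 : fcntl(fd,F_SETFL,flags | O_NONBLOCK);
}

/* Stand-in for tls_read(): ctx points to a non-blocking file descriptor */
ssize_t sketch_fd_read(void *ctx,void *buf,size_t buflen)
{
  ssize_t rc = read(*(int *)ctx,buf,buflen);

  if ((rc < 0) && ((errno == EAGAIN) || (errno == EWOULDBLOCK)))
    return SKETCH_WANT_POLLIN; /* nothing yet; ask the caller to wait */
  return rc;
}

/* Repeat the same call until it stops asking the caller to wait */
ssize_t sketch_read_retry(sketch_read_fn readf,void *ctx,int fd,void *buf,size_t buflen)
{
  for ( ; ; )
  {
    struct pollfd pfd = { .fd = fd , .events = 0 , .revents = 0 };
    ssize_t       rc  = readf(ctx,buf,buflen);

    if (rc == SKETCH_WANT_POLLIN)
      pfd.events = POLLIN;
    else if (rc == SKETCH_WANT_POLLOUT)
      pfd.events = POLLOUT;
    else
      return rc; /* data, end of file (0) or a hard error (-1) */

    /* wait for the condition the call asked for, then try again */
    if (poll(&pfd,1,-1) < 0)
      return -1;
  }
}
```

In a coroutine-based framework, the poll() call is presumably the spot where
the code would instead register interest in the descriptor and yield, leaving
the event loop to resume it once the descriptor is ready.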

I eventually got it working, but man, is it ugly. The Lua code wants to read
data, so I have to call into libtls. That, in turn, calls back into my code,
and if I don't have any data, I need to return TLS_WANT_POLLIN, which bubbles
up through libtls back to my code, which can then yield.

Meanwhile, from the other end, I get data from the network. I can't just feed
it into libtls; I have to hold it until libtls calls the callback asking for
the data. But when I get the data, I may need to resume the coroutine, so I
have to track that information as well.
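
To make that hand-off concrete, here is a minimal sketch of the two sides
meeting at a buffer. It is not the actual code; struct sketch_conn,
sketch_feed(), sketch_read_cb() and SKETCH_WANT_POLLIN are all hypothetical
names invented here, and the sketch assumes the callback argument points at
the per-connection state rather than at the Lua state. Only the callback's
signature is taken from the read callback shown earlier.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

#define SKETCH_WANT_POLLIN (-2) /* hypothetical stand-in for TLS_WANT_POLLIN */

struct tls; /* opaque here; the real definition belongs to libtls */

/* Hypothetical per-connection state shared by both sides */
struct sketch_conn
{
  char   buf[4096]; /* bytes from the network not yet handed to libtls */
  size_t len;       /* number of valid bytes in buf                    */
  bool   waiting;   /* a read came up empty, so a coroutine is parked  */
};

/* Network side: stash what arrived; true means "resume the coroutine" */
bool sketch_feed(struct sketch_conn *conn,void const *data,size_t datalen)
{
  bool resume = conn->waiting;

  /* sketch only: excess data is dropped; real code must not do this */
  if (datalen > sizeof(conn->buf) - conn->len)
    datalen = sizeof(conn->buf) - conn->len;

  memcpy(conn->buf + conn->len,data,datalen);
  conn->len     += datalen;
  conn->waiting  = false;
  return resume;
}

/* libtls side: same shape as the read callback signature shown earlier */
ssize_t sketch_read_cb(struct tls *tls,void *buf,size_t buflen,void *cb_arg)
{
  struct sketch_conn *conn = cb_arg;

  (void)tls; /* unused in this sketch */

  if (conn->len == 0)
  {
    /* nothing buffered; whoever called into libtls is expected to yield */
    conn->waiting = true;
    return SKETCH_WANT_POLLIN;
  }

  if (buflen > conn->len)
    buflen = conn->len;

  memcpy(buf,conn->buf,buflen);
  memmove(conn->buf,conn->buf + buflen,conn->len - buflen);
  conn->len -= buflen;
  return (ssize_t)buflen;
}
```

The waiting flag is the "track that information" part: the event loop calls
sketch_feed() when bytes arrive and only resumes a coroutine when one was
parked on an empty read.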

I can almost understand the code (and yes, I wrote it; did I mention it's
ugly?).

But I'm happy. The following code works in my existing network framework (boy
does that sound weird):

-----[ Lua ]-----
local function request(item)
 syslog('debug',"requesting %s",item.url)
 local u = url:match(item.url)

 -- -------------------------------------------------------
 -- asynchronous DNS lookup---blocks the current coroutine
 -- until a result is returned via the network.
 -- -------------------------------------------------------

 local addr = dns.address(u.host,'ip','tcp',u.port)

 if not addr then
   syslog('error',"finished %s---could not look up address",u.host)
   return
 end

 -- ---------------------------------------------------------
 -- This has nothing to do with the iPhone operating system,
 -- but everything to do with "Input/Output Stream"
 -- ---------------------------------------------------------

 local ios

 if u.scheme == 'http' then
   ios = tcp.connecta(addr[1]) -- connect via TCP
 else
   ios = tls.connecta(addr[1],u.host) -- connect via TLS
 end

 if not ios then
   syslog('error',"could not connect to %s",u.host)
   return
 end

 local path    = table.concat(u.path,'/')
 local fhname  = "header/" .. item.hdr
 local fbname  = "body/"   .. item.body
 local fh      = io.open(fhname,"w")
 local fb      = io.open(fbname,"w")

 local command = string.format([[
GET /%s HTTP/1.0
Host: %s
Connection: close
User-Agent: TLSTest/2.0 (Lua TLS Testing Program)
Accept: */*

]],path,u.host)

 ios:write(command)

 fh:write(ios:read("*h"))

 repeat
   local data = ios:read("*a")
   fb:write(data)
 until data == ""

 fb:close()
 fh:close()
 ios:close()

 syslog('debug',"finished %s %s",item.url,tostring(addr[1]))
end
-----[ END OF LINE ]-----

Any number of requests can be started and they all run concurrently, which is
just what I wanted.

Now, the code I have for the Lua wrapper for libtls covers just what I need
to do this. More work is required to finish covering the rest of the API
(Application Programming Interface). I also have to clean up the Lua code
that backs the above sample code so that I might have a chance of
understanding it at some point in the future.

And until I get the working code published, you can look at the “proof-of-
concept” Lua coroutine code [12] I worked from (and no, the above code sample
will not work as is with this “proof-of-concept” code).

[1] https://man.openbsd.org/tls_init.3
[2] gopher://gopher.conman.org/0Phlog:2018/07/19.1
[3] http://www.lua.org/
[4] https://github.com/spc476/libtls-examples
[5] https://en.wikipedia.org/wiki/Coroutine
[6] https://github.com/spc476/libtls-examples/blob/c9df11bbe1f57c1a5d0efe0896ef3fb131d44ad9/get3.c#L114
[7] gopher://gopher.conman.org/0Phlog:2017/02/27.1
[8] gopher://gopher.conman.org/gPhlog:2018/07/23/callback.gif
[9] gopher://gopher.conman.org/gPhlog:2018/07/23/coroutine.gif
[10] https://www.lua.org/manual/5.3/manual.html#lua_callk
[11] https://www.lua.org/manual/5.3/manual.html#4.7
[12] https://github.com/spc476/libtls-examples/blob/fca85f268c8a2d7e9a913d537d4acb067df5b6c8/Lua/get4.lua
