TLS overhead
------------

Right at the end of a recent post[1], sloum voiced his support for TLS
as a core part of a new protocol. I didn't make any mention of TLS in
my recent "protocol pondering intensifies" series[2,3,4], but based on
earlier stuff I've written[5] it should come as no surprise that I am
all in favour. Perhaps it was disingenuous not to make any mention of
this when I wrote about how my proposed protocol could still be used
over telnet. I don't feel too bad about it, though. For one thing,
I'm *sure* there is something like "telnet for TLS" out there. For
another, telnetability is not actually a super important practical
property. *I've* never actually surfed gopherspace in that way. It's
more of a token property that serves as a "seal of simplicity".

Anyway, today for the first time I asked myself what the overhead of
mandating TLS for all connections could be. I quickly came across
this write-up[6] which estimated the TLS handshake to need about 6.5
kilobytes. Oof! That was a kick in the teeth. It made a lot of the
stuff I wrote recently seem incredibly naive. I made a big deal out
of serving gophermaps saving 10 or 20 bytes per line compared to a
gopher menu. That benefit would be totally wiped out in most cases by
6.5kB of overhead.
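
As a rough sanity check on "most cases", here is a back-of-envelope
sketch. It takes the 6.5kB estimate from [6] as 6500 bytes (an
assumption about rounding) and asks how many menu lines it takes for a
saving of 10 or 20 bytes per line to pay for one handshake. The
function name is made up for illustration.

```python
import math

HANDSHAKE_BYTES = 6500  # the ~6.5kB estimate from [6], taken as 1kB = 1000 bytes


def break_even_lines(saving_per_line, overhead=HANDSHAKE_BYTES):
    """Menu lines needed before the per-line saving repays the handshake."""
    return math.ceil(overhead / saving_per_line)


for saving in (10, 20):
    print(saving, "bytes/line ->", break_even_lines(saving), "lines")
```

Under those assumptions a menu needs several hundred lines before the
savings catch up with the handshake.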

My first thought was that I would need to switch to keeping
connections open for re-use instead of immediately closing them after
the response was sent. This would at least allow spreading the TLS
overhead over several requests. It would be a real shame, though.
It would complicate client and server programming, and would also
require adding an extra component to the response header, equivalent
to HTTP's "Content-Length" header.
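
To illustrate why re-use needs a length in the header, here is a rough
sketch of the two ways a client could read a response. The header
layout (a decimal byte count on a line of its own) is purely
hypothetical and not part of any proposal. Both functions take any
binary file-like object, such as the one `socket.makefile("rb")`
returns.

```python
import io


def read_until_close(stream):
    """One request per connection: the body is everything up to EOF."""
    return stream.read()


def read_with_length(stream):
    """Re-used connection: a hypothetical header line gives the body size."""
    header = stream.readline()  # e.g. b"5\r\n" (made-up format)
    length = int(header.decode("ascii").strip())
    body = stream.read(length)
    if len(body) != length:
        raise EOFError("connection closed before the full body arrived")
    return body


# Two responses back to back on one "connection", simulated with BytesIO.
wire = io.BytesIO(b"5\r\nhello6\r\nworld!")
print(read_with_length(wire), read_with_length(wire))
```

The first function is all a close-after-response client needs. The
second is the extra bookkeeping that connection re-use would force on
every client and server.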

With a little more reading I learned that recent versions of TLS
support session resumption, where subsequent secure connections to the
same server can be established with very low overhead (about 330
bytes). I thought this could save things, but was disappointed to
find that Python's standard library `ssl` module doesn't seem to
support this. I don't want to design into the protocol a feature of
TLS which is not widely supported in high-quality libraries for
popular languages. Of course, some clients might be able to make use
of this, and I'd encourage it.

With yet more reading of the linked article, I relaxed a little bit.
That 6.5kB estimate is based on some assumptions specific to the
modern web. In particular, it assumes the server sending a chain of 4
certificates to a trusted root certificate. My plan from the start
for this protocol has been to shun the certificate authority system
used by the web in favour of a much simpler and less hierarchical
"TOFU" system similar to SSH: the first time a client connects to a
server, it accepts whatever certificate it gets, but remembers it, and
raises the alarm if the same server offers up a different certificate
in future. This would allow servers to send only a single,
self-signed certificate, which the article states can be as small as
800 bytes. So, maybe we can get a typical case around 1kB. That's
still relatively heavy, but it's a heck of a lot better than 6.5kB. I
think 1kB is low enough that I would rather swallow it than add
complexity by switching to a protocol oriented around reusing
connections for multiple requests.
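
The TOFU bookkeeping described above can be sketched in a few lines.
This is only an illustration under assumptions: the class name, the
in-memory dictionary and the choice of a SHA-256 fingerprint are all
made up here, and a real client would need to persist the store to
disk.

```python
import hashlib


class TofuStore:
    """Remembers one certificate fingerprint per (host, port)."""

    def __init__(self):
        self.known = {}  # (host, port) -> hex SHA-256 of the DER certificate

    def check(self, host, port, der_cert):
        """Return 'new', 'match' or 'MISMATCH' for the offered certificate."""
        fingerprint = hashlib.sha256(der_cert).hexdigest()
        key = (host, port)
        if key not in self.known:
            self.known[key] = fingerprint  # first use: trust and remember
            return "new"
        if self.known[key] == fingerprint:
            return "match"
        return "MISMATCH"  # raise the alarm; the stored value is kept
```

With a connected `ssl.SSLSocket` in Python, the bytes to pass in as
`der_cert` would come from `getpeercert(binary_form=True)`.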

On the face of it, an unavoidable 1kB overhead on every connection
would seem like a license to not care so much about saving 10 or 20
bytes in the response header. I don't want to fall into that trap,
though. For one thing, TLS session resumption might become a much
more widely supported feature in the future, in which case the
overhead might become a lot lower. For another, it's possible that
some people (e.g. retrocomputing fans) might want to run Gemini
unencrypted. Rest assured this will be in violation of the spec, but
folks doing it will be guilty of precisely the same sin that I'm
guilty of for including TLS support in VF-1, so I can't really
complain. So long as they do it on some non-standard port, that's
their prerogative.

So, I'm still in favour of mandating TLS, but a lot of reading and a
lot of care is going to be needed to specify using it in a way that
minimises overhead. All part of the fun.

There's another kind of overhead associated with TLS, beyond the
network traffic, and that's the implementation overhead. This is a
big concern of rain[7], who points out that it's totally impractical
for individual programmers to implement TLS (I fully agree), that they
will need to use libraries, and that it violates the spirit of gopher
to make the implementation so complex that a normal programmer can't
implement it in a weekend.

I'm hugely sympathetic to these concerns. One of the stated design
criteria for Gemini in the FAQ is that:

> A client comfortable for daily use which implements every single
> protocol feature should be a feasible weekend programming project
> for a single developer.

I don't think that relying on TLS conflicts with this. High-level TLS
support is now present in the standard libraries for Python and Go. I
am sympathetic to developers who like to avoid third-party
dependencies at all costs (VF-1 only "softly" depends on chardet), but
not using the *standard library* of your language doesn't make a lot
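of sense. Here's how to do a TLS connection in Python 3, assuming the
variable `s` is a regular TCP socket, already connected, of exactly
the kind you'd need to construct if Gemini didn't depend on TLS, and
`hostname` is the name of the host it's connected to:

----------
import ssl

context = ssl.create_default_context()
# The default context checks hostnames, so wrap_socket needs the name.
s = context.wrap_socket(s, server_hostname=hostname)
----------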

It's three additional lines of code. Yes, this uses all the default
settings and you are trusting the Python standard library developers
to have chosen sane and secure defaults. Even if you think you know
better than them and want to manually specify some things, you're not
talking about more than 10 lines of code total in all likelihood.
Python might be ahead of the game here (I honestly don't know), and
this might be trickier in other languages, but I strongly suspect it's
only going to get easier, on average, over time. Hopefully it will
also get easier to link these languages against OpenBSD's LibreSSL
instead of OpenSSL, so that the amount of code and the complexity of
code this pulls in will decrease.
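
As an illustration of "manually specifying some things", here is one
possible sketch of a client context for servers that present
self-signed certificates. It assumes Python 3.7 or later (for
`minimum_version`), and the function name is hypothetical. Note that
it switches off CA validation entirely, which is only sensible if the
client does its own certificate check afterwards, for example a TOFU
comparison.

```python
import ssl


def make_tofu_context():
    """A client context for servers with self-signed certificates."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    context.check_hostname = False       # must be disabled before CERT_NONE
    context.verify_mode = ssl.CERT_NONE  # skip CA validation; the client checks
    context.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse older protocols
    return context


context = make_tofu_context()
# Usage, with `s` and `hostname` assumed to exist as in the snippet above:
# s = context.wrap_socket(s, server_hostname=hostname)
```

That is still well under ten lines of actual configuration.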

I completely understand the decreased feeling of satisfaction and
self-sufficiency that comes from having critical functionality
provided by a large chunk of complex code that you didn't write
yourself. Though, let's be honest with ourselves - gopher clients
which don't have to worry about TLS are still sitting atop the OS's
TCP/IP stack, DNS library, filesystem and a bunch of other stuff that
the average person has no hope of implementing well in a weekend. I
don't see that relying on your programming language's standard library
is cheating any more than relying on your operating system is.

I'd love a simpler, lighter alternative, but realistically I don't see
any which is going to do the job. Rolling your own crypto is fraught
with peril. SSL libraries may be large and complex, but they exist in
just about any language and they are used and tested by a lot of
people, many of whom know more about what they are doing than the
average developer who might implement Gemini. Anything else is almost
guaranteed to be less portable and less well vetted. I'm open to
concrete suggestions if anybody has them, but for now I still think
TLS is our best bet.

[1] gopher://colorfield.space:70/0/~sloum/phlog/190619.txt
[2] gopher://zaibatsu.circumlunar.space:70/0/~solderpunk/phlog/protocol-pondering-intensifies.txt
[3] gopher://zaibatsu.circumlunar.space:70/0/~solderpunk/phlog/protocol-pondering-intensifies-ii.txt
[4] gopher://zaibatsu.circumlunar.space:70/0/~solderpunk/phlog/protocol-pondering-intensifies-iii.txt
[5] gopher://zaibatsu.circumlunar.space:70/0/~solderpunk/phlog/why-gopher-needs-crypto.txt
[6] http://netsekure.org/2010/03/tls-overhead/
[7] gopher://tilde.team:70/0/~rain1/phlog/20190608-encrypting-gopher.txt