Some musings on coroutines

* * * * *

Some musings on coroutines

Over a year ago [1], having nothing else better to do (seeing how the power
was out), I did some research into co-routines [2]. I figure now is as good a
time as any to talk about some conclusions I came to (seeing how all that
programming talk yesterday is inspiring).

> Subroutines are special cases of more general program components, called
> “coroutines.” In contrast to the unsymmetric relationship between a main
> routine and a subroutine, there is complete symmetry between coroutines,
> which call on each other.
>
> To understand the coroutine concept, let us consider another way of
> thinking about subroutines. The viewpoint adopted in the previous section
> was that a subroutine merely was an extension of the computer hardware,
> introduced to save lines of coding. This may be true, but another point of
> view is possible: We may consider the main porogram and the subroutine as a
> team of programs, with each member of the team having a certain job to do.
> The main program, in the course of doing its job, will activate the
> subprogram; the subprogram performs its own function and then activates the
> main program. We might stretch our imagination to believe that, from the
> subroutine's point of view, when it exits it is calling the main routine;
> the main routine continues to perform its duty, then “exits” to the
> subroutine. The subroutine acts, then calls the main routine again.
>
> This somewhat far-fetched philosophy actually takes place with coroutines,
> when it is impossible to distinguish which is a subroutine of the other.
> Suppose we have coroutines A and B; when programming A, we may think of B
> as our subroutine, but when programming B, we may think of A as our
> subroutine. … It represents teamwork as in a relay race. Whenever a
> coroutine is activated, it resumes execution of its program at the point
> where the action was last suspended.
>
> “_The Art of Computer Programming_ [3], § 1.4.2”
>

When researching just about anything dealing with computers, it's probably
best to start with the source, Donald Knuth [4]. § 1.4.2 covers the basics of
coroutines and gives an example, which would look something like this (when
converted from MIX assembly, used for all the examples in the book, to a C-
like langauge, which is a bit easier to understand):

> int input(void)
> {
> int c;
>
> while(1)
> {
> c = getchar();
> if (isdigit(c))
> {
> int count = c - '0' + 1;
> c = getchar();
> for ( ; count ; count--)
> yield to output(c);
> }
> else
> yield to output(c);
> if (c == '.')
> return;
> }
> }
>
> void output(int c)
> {
> int count = 0;
> int c;
>
> while(1)
> {
> yield to c = input();
> putchar(c);
> if (c == '.')
> return;
> count++;
> if (count == 3)
> {
> putchar(' ');
> count = 0;
> }
> }
> }
>

The thing to understand about coroutines is that it isn't a normal subroutine
call. Normally, each time input() is called, control is passed to the first
line of the routine, but in a coroutine “call,” control is initially passed
to the first line, but each subsequent “call” resumes where input() left off
(in this case, marked with the keyword yield). If we change the code a bit
(remember, this isn't C, but a C-like langauge), this can become clearer:

> void input()
> {
> while(1)
> {
> c = getchar();
> if (isdigit(c))
> {
> int count = c - '0' + 1;
> c = getchar();
> for ( ; count ; count--)
> send c to output;
> }
> else
> send c to output;
> if (c == '.')
> return;
> }
> }
>
> void output()
> {
> int count = 0;
> int c;
>
> while(1)
> {
> receive c from input;
> putchar(c);
> if (c == '.')
> return;
>
> count++;
> if (count == 3)
> {
> putchar(' ');
> count = 0;
> }
> }
> }
>

Here, it should be a bit clearer what is going on—input() is getting some
data, then passing it to the coroutine output(). I've changed the way that
coroutines are “called” for two reasons. One, it's a bit clearer what is
going on, and two, there's a bit more going on here than just very fancy
gotos. In § 1.4.2, Knuth gives the following illustration (recreated here):

[Fig. 22 from § 1.4.2 of “The Art of Computer Programming”] [5]
Passes: (a) a four-pass algorithm, and (b) a one-pass algorithm.

Those familiar with Unix should see a familiar pattern here:

> GenericUnixPrompt> pass-a <input >tape.1
> GenericUnixPrompt> pass-b <tape.1 >tape.2
> GenericUnixPrmpot> pass-c <tape.2 >tape.3
> GenericUnixPrompt> pass-d <tape.3 >output
>

Or, more succinctly:

> GenericUnixPrompt> pass-a < input | pass-b | pass-c | pass-d >output
>

Coroutines are nothing more than routines that communicate via pipes! A type
of message passing, if you will.

Not only that, but given the rise of multi-core processors, this appears to
be a natural fit for multithreaded programs. A piped Unix command line uses a
separate process for each component, so it appears to be a no-brainer that
each component in a coroutine chain can be given its own thread. With that,
and some syntactic help, we can rewrite Knuth's example and extend it a bit:

> void input()
> receive char;
> send char;
> {
> char c;
>
> while(1)
> {
> receive c;
> if (isdigit(c))
> {
> int count = c - '0' + 1;
> receive c;
> for ( ; count ; count --)
> send c;
> }
> else
> send c;
>
> if (c == '.')
> return;
> }
> }
>
> void output()
> receive char;
> send char;
> {
> int count = 0;
> char c;
>
> while(1)
> {
> receive c;
> send c;
> if (c == '.')
> return;
>
> count++;
> if (count == 3)
> {
> send ' ';
> count = 0;
> }
> }
> }
>
> void filter(char from,char to)
> receive char;
> send char;
> {
> while(1)
> {
> accept c;
> if (c == from)
> c = to;
> send c;
> }
> }
>
> int main(void)
> {
> string in;
> string out;
>
> in = get_input();
> in => input()
> => filter('a','n')
> => filter('e','x')
> => output() => out;
> print(out);
> }
>

Coroutines make the same fringe problem [6] trivial to implement:

> void tree_leaves(Tree t)
> send Tree;
> {
> if (t == nil)
> send nil;
> if (is_leaf(t))
> send t;
> send tree_leaves(t->left);
> send tree_leaves(t->right);
> }
>
> bool same_fringe(Tree t1,Tree t2)
> {
> Tree tmp1;
> Tree tmp2;
>
> while
> (
> (tree_leaves(t1) => tmp1)
> && (tree_leaves(t2) => tmp2)
> )
> {
> if (tmp1 != tmp2)
> return (false);
> }
> return true;
> }
>

There are even more implications here. The producer/consumer problem [7] is
one used to teach semaphores, but with coroutines, it becomes rather trivial:

> void producer()
> send datum;
> {
> datum data;
>
> while(1)
> {
> data = generate_datum();
> send data;
> }
> }
>
> void consumer()
> receive datum;
> {
> datum data;
>
> while(1)
> {
> receive data;
> crunch_datum(data);
> }
> }
>
> producer() <=> consumer();
>

Even with multiple consumers, I don't see a problem with this model:

> producer() <=> (consumer(), consumer());
>

And other problems used to teach semaphores, like the multiple-reader,
exclusive writer problem, become trivial:

> void master_control_program()
> receive command;
> receive datum;
> send datum;
> {
> command cmd;
> datum data;
>
> while(1)
> {
> accept cmd;
> if (cmd == read)
> send data;
> else if (cmd == write)
> receive data;
> }
> }
>
> void reader(int id)
> {
> datum data;
>
> while(1)
> {
> send read;
> receive data;
> process(data,id);
> }
> }
>
> void writer(int id)
> {
> datum data;
>
> while(1)
> {
> sleep(random());
> data = generate_datum(id);
> send write;
> send data;
> }
> }
>
> master_control_program() <=> ( reader(1),
> reader(2),
> reader(3),
> writer(1),
> reader(4),
> writer(2),
> writer(3),
> reader(5));
>

But in retrospect, this is nothing more than a reinvention of communicating
sequential processes [8], and what I've managed to do is basically hide
message passing within the framework of coroutines, which is why I suddenly
found myself easily solving problems that normally require semaphores.

Granted, there aren't any langauges that do this sort of thing (well, there
is Erlang [9], but it's a functional language, not imperative like C or C++),
and if I were to implement a language with coroutine support like this, there
are still issues of implementation and syntax to work out, but I'm still
playing around with this stuff for now.

[1] gopher://gopher.conman.org/0Phlog:2005/10/26.1
[2] http://c2.com/cgi/wiki?CoRoutine
[3] http://www.amazon.com/gp/product/0201485419?ie=UTF8&tag=conmanlaborat-20&link_code=as3&camp=211189&creative=373489&creativeASIN=0201485419
[4] http://www-cs-faculty.stanford.edu/~knuth/
[5] gopher://gopher.conman.org/gPhlog:2007/01/14/fig22.gif
[6] http://c2.com/cgi/wiki?SameFringeProblem
[7] http://www.codeproject.com/threads/ProducerConsumerModel.asp
[8] http://c2.com/cgi/wiki?CommunicatingSequentialProcesses
[9] http://www.erlang.org/

Email author at [email protected]