## Forced Labour in Academia - How and Why

- I am Athas, general Bitreich hangaround for a few years.

- When not on #bitreich, I am an assistant professor at the University
 of Copenhagen.

- I am very particular about my software, and Bitreich and academia are
 two places where I can get away with it.

In this talk I will speak about my experiences getting students to
write code for me.

## Why should you care?

- If you are a computer science teacher (probably not so many).

- If you are a student (maybe some of you).

- Maybe my experiences are a little interesting.

- Maybe my experiences are also relevant for hobby projects.

## The bureaucratic context

- Three years of bachelor's study, finishing in a bachelor's project.

- Two years of master's study, finishing in a master's thesis.

- (Then maybe a PhD, but that's outside of the scope of this talk.)

#pause

Each student must do two big projects.

Can also do projects instead of courses.

## How many true projects I supervise per year

  +-----------------------------------------------------------+
  |                                                           |
10 |-+                                =                        |
  |                                  =                        |
  |                        =         =                        |
8 |-+                      =         =                        |
  |                        =         =                        |
  |                        =         =         =    =         |
  |                        =         =         =    =         |
6 |-+            =         =         =         =    =    =    |
  |              =         =         =         =    =    =    |
  |              =         =         =         =    =    =    |
  |              =         =         =         =    =    =    |
4 |-+            =         =    =    =    =    =    =    =    |
  |              =         =    =    =    =    =    =    =    |
  |              =    =    =    =    =    =    =    =    =    |
2 |-+  =         =    =    =    =    =    =    =    =    =    |
  |    =         =    =    =    =    =    =    =    =    =    |
  |    =    =    =    =    =    =    =    =    =    =    =    |
  |    =    =    =    =    =    =    =    =    =    =    =    |
0 +-----------------------------------------------------------+
 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025

## What do I get out of it?

- Free labour that does things I don't have time to do myself.

- Fulfilling my employment obligations.

- Altruism.

## What do students get out of it?

- The university says they have to do projects to graduate.

- But many students also like to do something that matters.

#pause

There is some competition:

- Why do my projects instead of those of some other researcher?

- Why do a project instead of a course?

## What do students need in a project?

- Has to be doable in the allotted time.

 - Typically four months, half time or full time.

- Needs to have academic depth.

- Needs to be relatively well-specified from the start.

- Some kind of confidence that they will receive decent supervision.

Motivations differ, but they often want to do something real.

## I am picky

- I supervise only the better students.

- The department has other supervisors who supervise sausage factory
 projects.

 - E.g. "implement some textbook algorithm and see whether practical
   performance matches the theory".

 - My projects are not just busywork!

- Only a fraction of our students can contribute productively to a
 real research project.

## My work

Most of my research centers around a programming language called
Futhark - https://futhark-lang.org :

   def dotprod [n] (xs: [n]f64) (ys: [n]f64): f64 =
     reduce (+) 0 (map2 (*) xs ys)

   def matmul [n][m][p] (a: [n][m]f64) (b: [m][p]f64): [n][p]f64 =
     map (\a_row ->
            map (\b_col ->
                   dotprod a_row b_col)
                (transpose b))
         a

- Data-parallel, purely functional, high-performance programming
 language.

- Runs on GPUs and multicore CPUs.

- Not general-purpose - compiles to library code that you call from C,
 Python, SML, whatever...

## What we research

- Programming language design:

 - E.g. design of type systems.

- Compiler optimisations:

 - E.g. how to map high-level programs to low-level hardware.

- Parallel algorithms:

 - E.g. how do you express otherwise-sequential stuff (e.g. parsing)
   using a parallel vocabulary?

 - And how fast is it in practice?
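
A minimal sketch of what "parallel vocabulary" means, written in Python
for illustration and not taken from our code. Checking balanced
parentheses looks sequential, but it can be phrased as a map, a scan
(prefix sum) and a reduction, each of which has a parallel
implementation. Python runs them sequentially here; the name `balanced`
is made up for this example.

```python
from itertools import accumulate
from operator import add


def balanced(s: str) -> bool:
    """Check parenthesis balance using only map, scan and reduce steps."""
    # map: every character is turned into +1, -1 or 0 independently.
    deltas = [1 if c == "(" else -1 if c == ")" else 0 for c in s]
    # scan: inclusive prefix sum, giving the nesting depth after each character.
    # Addition is associative, which is what makes a parallel scan possible.
    depths = list(accumulate(deltas, add))
    # reduce: the depth never goes negative, and it ends at zero.
    never_negative = all(d >= 0 for d in depths)
    ends_at_zero = depths[-1] == 0 if depths else True
    return never_negative and ends_at_zero
```

In Futhark the same three steps would roughly correspond to `map`,
`scan` and `reduce`.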

## Is it Unix?

- Yes: does one thing well!

#pause

- Yes: integrates easily with C!

 - 'futhark c --library foo.fut' produces

   - foo.c
   - foo.h

       with a fairly straightforward API:
       https://futhark.readthedocs.io/en/latest/c-api.html

#pause

- No: compiler consists of almost 100k lines of Haskell.

## So what do projects look like?

- Project is mature, so little low-hanging fruit remains.

- As a starting point, anything you can imagine for a similar free
 software project.

- Most projects are

 - applications (benchmarks),

 - backend work,

 - or tools.

## Structure of a traditional compiler

- Frontend:

 - Parsing, type checking, desugaring...

- Middle-end:

 - Optimisations and other transformations.

 - By far the largest.

 - Sort of Unix.

- Backend:

 - Code generation and runtime systems.

The frontend and backend are the best places for students to work.

Also more immediately motivating.

## Example of a project: CUDA backend

- Short two-month project.

- Implemented by imitating an existing backend and slotted into the
 very end of the compiler.

- Extremely useful to our work, as (proprietary!) CUDA is a sort of
 standard.

- Remains maintained and heavily used.

Similarly: multicore backend, ISPC backend, ...

## Example of a project: C# backend

- Long MSc project.

- Good work.

- Later removed because we didn't have a real use for it.

## Example of a project: Language Server

- Completely separate (sub-)program: 'futhark lsp'.

- Uses the existing parser and type checker to extract program
 information, through a limited (but existing) interface.

- At end of project, effect is very visible: colourful popups in
 VSCode and whatnot.

 - (Turns out Emacs also has a very nice and simple LSP client built
   in.)

## Example of a project: Reactive Benchmarking

- Much of my work requires me to measure how fast a program is.
- This is surprisingly tricky!

 - Shorter runtimes need more runs.
 - Performance can vary over time.

        20 +-----------------------------------------------+
           |                               ****************|
        18 |                              *                |
        16 |                             *                 |
        14 |                             *                 |
        12 |                           *                   |
           |                           *                   |
        10 |***************************                    |
           +-----------------------------------------------+
           1    2     3    4    5     6    7    8     9    10
                                sample

- Had a student look into statistical wizardry to figure out how many
 samples are needed, and whether performance has "plateaued".

 - Implemented in our automated 'futhark bench' tool.

- I use this work all the time now - as do my collaborators.
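
To make the two questions concrete, here is a rough Python sketch. It is
not the method implemented in 'futhark bench'; it swaps in two simple
heuristics (standard error of the mean, and a comparison of the two most
recent windows of samples). The function names, the thresholds and the
window size are all assumptions chosen for illustration.

```python
import statistics


def enough_samples(samples, rel_tol=0.05, min_samples=5):
    """True when the standard error of the mean is within rel_tol of the mean."""
    if len(samples) < min_samples:
        return False
    mean = statistics.fmean(samples)
    # Standard error of the mean: sample standard deviation over sqrt(n).
    sem = statistics.stdev(samples) / len(samples) ** 0.5
    return sem <= rel_tol * abs(mean)


def has_plateaued(samples, window=4, rel_tol=0.05):
    """True when the last two windows of samples have similar means."""
    if len(samples) < 2 * window:
        return False
    prev = statistics.fmean(samples[-2 * window:-window])
    last = statistics.fmean(samples[-window:])
    return abs(last - prev) <= rel_tol * abs(prev)
```

With measurements shaped like the plot above (flat, then a jump), the
window comparison reports no plateau until the samples after the jump
fill both windows.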

## Example of a project: New Fusion Engine

- Fusion is a critical optimisation that merges adjacent operations
 to avoid intermediate results.

       map f (map g x)    =>     map (f o g) x

- We had a fusion engine for a long time that conceptually optimised a
 data dependency graph by merging nodes when possible, based on
 various rewrite rules:

   1    2  3         1    2           1
   |    |  |         |    |           |
    \  /   4     =>   \  /  3+4  =>   |     3+4
     \/    |           \/    |        |      |
     6-----/           6-----/       2+6-----/

- But our implementation was crap and old and did not really match the
 (fairly nice) theoretical algorithm.

- Project was about rewriting this entire transformation without
 changing existing behaviour.
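
The map-map rule above can be illustrated with a small Python sketch.
This is only the rewrite rule itself, not our fusion engine, and the
names `compose`, `unfused` and `fused` are made up for this example. The
rule is valid because f and g are assumed to be pure functions.

```python
def compose(f, g):
    """Return the function x -> f(g(x))."""
    return lambda x: f(g(x))


def unfused(f, g, xs):
    """map f (map g xs): builds an intermediate list."""
    tmp = [g(x) for x in xs]  # intermediate result that fusion removes
    return [f(y) for y in tmp]


def fused(f, g, xs):
    """map (f o g) xs: a single traversal, no intermediate list."""
    h = compose(f, g)
    return [h(x) for x in xs]
```

Both functions compute the same result for pure f and g; the fused
version never materialises the intermediate list.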

## More metrics

- 52 projects in total:

 - 15 MSc
 - 31 BSc
 - 6 others

- 15 unrelated to the language itself

   - Benchmarks and such

- 22 ended up merged into the compiler itself

## When it goes wrong

- Sometimes I don't vet a student properly.

 - Or they misrepresent their capabilities.

- This sucks.

- Switch project focus to maximise odds of passing (with low grade).

## Conclusions

- My research is not optimal for student engagement.

- But I do manage to attract talented students.

- They do useful work I don't do myself.

 - Loose coupling of program components is key.

- I think the guarantee of diligent supervision and help is a big
 draw.

   - Could Bitreich do something similar?

   - Maybe I could supervise a Bitreich-relevant project?
      (Actually considering that for the embryonic 'energy' tool.)