## Forced Labour in Academia - How and Why
- I am Athas, general Bitreich hangaround for a few years.
- When not on #bitreich, I am an assistant professor at the University
of Copenhagen.
- I am very particular about my software, and Bitreich and academia are
two places where I can get away with it.
In this talk I will speak about my experiences getting students to
write code for me.
## Why should you care?
- If you are a computer science teacher (probably not so many).
- If you are a student (maybe some of you).
- Maybe my experiences are a little interesting.
- Maybe my experiences are also relevant for hobby projects.
## The bureaucratic context
- Three years of bachelor's study, finishing in a bachelor's project.
- Two years of master's study, finishing in a master's thesis.
- (Then maybe a PhD, but that's outside of the scope of this talk.)
#pause
Each student must do two big projects.
They can also do projects instead of courses.
## How many true projects I supervise per year
[Bar chart: number of projects supervised per year, 2013 to 2025.
The y-axis runs from 0 to 10; the busiest year reaches 10.]
## What do I get out of it?
- Free labour that does things I don't have time to do myself.
- Fulfilling my employment obligations.
- Altruism.
## What do students get out of it?
- The university says they have to do projects to graduate.
- But many students also like to do something that matters.
#pause
There is some competition:
- Why do my projects instead of those of some other researcher?
- Why do a project instead of a course?
## What do students need in a project?
- Has to be doable in the allotted time.
  - Typically four months, half time or full time.
- Needs to have academic depth.
- Needs to be relatively well-specified from the start.
- Some kind of confidence that they will receive decent supervision.
Motivations differ, but they often want to do something real.
## I am picky
- I supervise only the better students.
- The department has other supervisors who supervise sausage factory
  projects.
  - E.g. "implement some textbook algorithm and see whether practical
    performance matches the theory".
- My projects are not just busywork!
- Only a fraction of our students can contribute productively to a
real research project.
## My work
Most of my research centers around a programming language called
Futhark (https://futhark-lang.org):
    def dotprod [n] (xs: [n]f64) (ys: [n]f64): f64 =
      reduce (+) 0 (map2 (*) xs ys)

    def matmul [n][m][p] (a: [n][m]f64) (b: [m][p]f64): [n][p]f64 =
      map (\a_row ->
             map (\b_col ->
                    dotprod a_row b_col)
                 (transpose b))
          a
- Data-parallel, purely functional, high-performance programming
language.
- Runs on GPUs and multicore CPUs.
- Not general-purpose - compiles to library code that you call from C,
Python, SML, whatever...
## What we research
- Programming language design:
  - E.g. design of type systems.
- Compiler optimisations:
  - E.g. how to map high-level programs to low-level hardware.
- Parallel algorithms:
  - E.g. how do you express otherwise-sequential stuff (e.g. parsing)
    using a parallel vocabulary?
  - And how fast is it in practice?
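To give a flavour of what "a parallel vocabulary" means, here is a minimal
Python sketch of a toy parsing problem: checking whether parentheses
balance. It is an illustration only, not code from the Futhark project, and
the name `balanced` is made up. It uses nothing but a map, a scan (prefix
sum) and a reduction. All three have well-known parallel implementations,
even though Python itself runs them sequentially.

```python
from functools import reduce
from itertools import accumulate
import operator


def balanced(s: str) -> bool:
    """Check parenthesis balance using only map, scan and reduce."""
    # map: '(' counts as +1, ')' as -1, anything else as 0.
    deltas = [1 if c == "(" else -1 if c == ")" else 0 for c in s]
    # scan: running nesting depth after each character (prefix sums).
    depths = list(accumulate(deltas, operator.add))
    # reduce: the lowest depth seen; the initial 0 covers the empty string.
    lowest = reduce(min, depths, 0)
    total = depths[-1] if depths else 0
    # Balanced means the depth never goes negative and ends at zero.
    return lowest >= 0 and total == 0


print(balanced("(a(b))c"))
print(balanced(")("))
```

There is no loop carrying state from one character to the next; the
sequential dependency is hidden inside the scan, whose operator (+) is
associative.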
## Is it Unix?
- Yes: does one thing well!
#pause
- Yes: integrates easily with C!
  - 'futhark c --library foo.fut' produces
    - foo.c
    - foo.h
    with a fairly straightforward API:
    https://futhark.readthedocs.io/en/latest/c-api.html
#pause
- No: compiler consists of almost 100k lines of Haskell.
## So what do projects look like?
- Futhark is mature, so there is little low-hanging fruit left.
- As a starting point, anything you can imagine for a similar free
software project.
- Most projects are
  - applications (benchmarks),
  - backend work,
  - or tools.
## Structure of a traditional compiler
- Frontend:
  - Parsing, type checking, desugaring...
- Middle-end:
  - Optimisations and other transformations.
  - By far the largest.
  - Sort of Unix.
- Backend:
  - Code generation and runtime systems.
The frontend and backend are the best places for students to work.
They are also more immediately motivating.
## Example of a project: CUDA backend
- Short two-month project.
- Implemented by imitating an existing backend and slotting it into the
very end of the compiler.
- Extremely useful to our work, as (proprietary!) CUDA is a sort of
standard.
- Remains maintained and heavily used.
Similarly: multicore backend, ISPC backend, ...
## Example of a project: C# backend
- Long MSc project.
- Good work.
- Later removed because we didn't have a real use for it.
## Example of a project: Language Server
- Completely separate (sub-)program: 'futhark lsp'.
- Makes use of existing parser, typechecker to extract program
information - limited (but existing) interface.
- At end of project, effect is very visible: colourful popups in
VSCode and whatnot.
- (Turns out Emacs also has a very nice and simple LSP client built
in.)
## Example of a project: Reactive Benchmarking
- Much of my work requires me to measure how fast a program is.
- This is surprisingly tricky!
  - Shorter runtimes need more runs.
  - Performance can vary over time.
[Plot of a measured value (y-axis, 10 to 20) against sample number
(x-axis, 1 to 10). The value sits flat at about 10 for the first samples,
then jumps sharply to about 20 and stays there.]
- Had a student look into statistical wizardry to figure out how many
samples are needed, and whether performance has "plateaued".
- Implemented in our automated 'futhark bench' tool.
- I use this work all the time now - as do my collaborators.
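To make the two problems concrete, here is a minimal Python sketch. It is
not the algorithm used in 'futhark bench', just the simplest thing that
shows the idea. The names `adaptive_samples` and `has_shifted`, and all
thresholds, are assumptions for illustration. The first function keeps
sampling until the standard error of the mean is small relative to the
mean, so noisy measurements get more runs. The second is a crude check for
whether performance changed between the first and second half of the
samples.

```python
import statistics


def adaptive_samples(measure, min_runs=10, max_runs=1000, rel_err=0.01):
    """Call measure() until the mean looks stable, or max_runs is hit.

    measure is any function returning one runtime as a float.
    """
    samples = []
    while len(samples) < max_runs:
        samples.append(measure())
        if len(samples) >= min_runs:
            mean = statistics.mean(samples)
            # Standard error of the mean: shrinks as samples accumulate.
            sem = statistics.stdev(samples) / len(samples) ** 0.5
            if sem <= rel_err * mean:
                break
    return samples


def has_shifted(samples, tol=0.1):
    """Crude drift check: do the two halves of the run disagree?

    Needs at least two samples.
    """
    half = len(samples) // 2
    early = statistics.mean(samples[:half])
    late = statistics.mean(samples[half:])
    return abs(late - early) > tol * early
```

A perfectly steady measurement stops after `min_runs` samples, while a
noisy one runs until `max_runs`. A run that sits at 10 and then jumps to 20
is flagged by `has_shifted`, so its mean should not be trusted.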
## Example of a project: New Fusion Engine
- Fusion is a critical optimisation that merges adjacent operations
to avoid intermediate results.
map f (map g x) => map (f o g) x
- We had a fusion engine for a long time that conceptually optimised a
data dependency graph by merging nodes when possible, based on
various rewrite rules:
     1  2   3        1  2             1
     |  |   |        |  |             |
     \  /   4   =>   \  /  3+4   =>   |     3+4
      \/    |         \/    |         |      |
      6-----/         6-----/        2+6-----/
- But our implementation was crap and old and did not really match the
(fairly nice) theoretical algorithm.
- Project was about rewriting this entire transformation without
changing existing behaviour.
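The map-map rule from this slide can be sketched in a few lines of Python.
This is only an illustration of the rewrite itself, not of the graph-based
engine in the compiler, and the names `unfused` and `fused` are made up.
The rewrite is valid because f and g are pure functions, which is always
the case in a purely functional language.

```python
def unfused(f, g, xs):
    """map f (map g xs): builds a full intermediate list first."""
    intermediate = [g(x) for x in xs]
    return [f(y) for y in intermediate]


def fused(f, g, xs):
    """map (f o g) xs: a single pass with no intermediate list."""
    return [f(g(x)) for x in xs]


def double(x):
    return 2 * x


def inc(x):
    return x + 1


# Both versions compute the same result; only the memory traffic differs.
print(unfused(double, inc, [1, 2, 3]) == fused(double, inc, [1, 2, 3]))
```

On a GPU the intermediate array would have to be written to and read back
from memory, which is why avoiding it matters so much.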
## More metrics
- 52 projects in total:
  - 15 MSc
  - 31 BSc
  - 6 others
- 15 unrelated to the language itself
  - Benchmarks and such
- 22 ended up merged into the compiler itself
## When it goes wrong
- Sometimes I don't vet a student properly.
- Or they misrepresent their capabilities.
- This sucks.
- Switch project focus to maximise odds of passing (with low grade).
## Conclusions
- My research is not optimal for student engagement.
- But I do manage to attract talented students.
- They do useful work I don't do myself.
- Loose coupling of program components is key.
- I think the guarantee of diligent supervision and help is an
attractor.
- Could Bitreich do something similar?
- Maybe I could supervise a Bitreich-relevant project?
(Actually considering that for the embryonic 'energy' tool.)