Sorting and counting
====================

Imagine that you are administering a small unix system and you want to
know how many processes each user is currently running, with the list
sorted in decreasing order of process count. The following one-liner:

 $ ps aux | cut -d " " -f 1 | tail -n +2 | sort | uniq -c | sort -rn

does the trick. Let's dissect it to understand how it works.

The command ps(1) lists all the processes currently running on your
system, together with the name of the user who owns each process:

 $ ps aux
 USER      PID  %CPU %MEM    VSZ   RSS TT  STAT STARTED        TIME COMMAND
 root       11 100.0  0.0      0    16  -  RNL  29Nov18 69599:49.33 [idle]
 root        0   0.0  0.0      0   240  -  DLs  29Nov18     0:13.81 [kernel]
 root        1   0.0  0.0   5424   128  -  ILs  29Nov18     0:01.03 /sbin/init --
 root        2   0.0  0.0      0    16  -  DL   29Nov18     0:00.00 [crypto]
  ...........
  ...........
 uwu     30154   0.0  0.8  11680  8004 27  I+   08:27       0:00.03 /usr/local/bin/lua52 /usr/local/bin/telem.lua uwu
 uwu     27175   0.0  0.5   8520  5320 28  Is   06:35       0:00.03 -zsh (zsh)
 uwu     27178   0.0  0.5   8188  5220 28  S+   06:35       0:58.73 lua /usr/local/bin/odlli (lua52)
 $

That is a fairly long list, but user names appear in the first column,
and the fields are separated by a variable number of spaces. For the
moment we just need the user names, that is, whatever precedes the first
space on each line, so cut(1) comes in handy:

 $ ps aux | cut -d " " -f 1
 USER
 root
 root
 root
 root
  ...........
  ...........
 uwu
 uwu
 uwu
 $

Notice that the first line contains "USER" which is not a real user name
(it's just part of the header added by ps(1)), so we will need to get
rid of it using the command tail(1):

 $ ps aux | cut -d " " -f 1 | tail -n +2
 root
  ...........
 uwu
 $
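A side note on the "variable number of spaces": cut(1) treats every
single space as a separator, which is fine for the first column but not
for the following ones. The sketch below uses a few made-up ps-like
lines (the function name sample_ps is only for illustration) to show
the difference, with awk(1) as one possible alternative:

```shell
# Made-up sample resembling ps output, padded with runs of spaces.
sample_ps() {
    printf '%s\n' \
        'USER      PID  %CPU COMMAND' \
        'root       11 100.0 [idle]' \
        'uwu     30154   0.0 -zsh (zsh)'
}

# With -d " ", field 2 is the empty string between the first two
# spaces, so this prints only empty lines instead of the PIDs.
sample_ps | cut -d " " -f 2

# awk splits on runs of blanks by default; NR > 1 skips the header.
sample_ps | awk 'NR > 1 { print $1, $2 }'
```

For the one-liner discussed here only the first column matters, so
cut(1) is enough.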

Now, each user name appears in that list a number of times equal to the
number of processes currently run by that user. How do we count these
occurrences? The trick is to use sort(1) and uniq(1). The command
sort(1) sorts a file (or the lines provided on its input), and by
default it uses lexicographical order:

 $ ps aux | cut -d " " -f 1 | tail -n +2 | sort
 _dhcp
 _pflogd
 bbs
 bbs
 bbs
 ben
 ben
 ben
  ...........
 slugmax
 slugmax
 slugmax
 spring
 uwu
 uwu
 uwu
 uwu
 $

The command uniq(1) will remove contiguous repetitions of each line
given on input:

 $ ps aux | cut -d " " -f 1 | tail -n +2 | sort  | uniq
 _dhcp
 _pflogd
 bbs
 ben
 cleber
 irc
 katolaz
 leeb
 lntl
 nobody
 postfix
 root
 slugmax
 spring
 uwu
 $

Notice that this is just the list of users currently owning at least
one running process, which is not exactly what we were after. However,
the option '-c' of uniq(1) does the job, since it prefixes each line
with the number of contiguous repetitions that were found:

 $ ps aux | cut -d " " -f 1 | tail -n +2 | sort  | uniq -c
    1 _dhcp
    1 _pflogd
    3 bbs
    4 ben
    5 cleber
    1 irc
   22 katolaz
   10 leeb
    3 lntl
    1 nobody
    3 postfix
   56 root
   12 slugmax
    1 spring
    8 uwu
 $
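Since uniq(1) only looks at contiguous repetitions, the sort(1) step
before it is what makes the counts meaningful. A small sketch on
made-up input (sample_users is a hypothetical helper, not part of the
original pipeline) shows what happens with and without sorting:

```shell
# Made-up list of user names, deliberately not sorted.
sample_users() {
    printf '%s\n' uwu root uwu root root
}

# Without sort: only adjacent duplicates are merged, so the same
# user shows up on several lines with partial counts.
sample_users | uniq -c

# With sort: equal lines become adjacent, one line per user.
sample_users | sort | uniq -c
```

The first command prints four lines (uwu and root each appear twice),
while the second prints one line per user, with 3 for root and 2 for
uwu. The exact padding in front of the counts may vary between
implementations of uniq(1).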

This means that user _dhcp has 1 running process, user cleber has 5
running processes, user root has 56 running processes, and so on. Since
ps(1) takes a fresh snapshot every time it runs, the counts in the
listings that follow may differ slightly. We are almost there. We just
need to sort the resulting list according to the numbers appearing at
the beginning of each line. This is done by using sort(1) again, with
the option '-n':

 $ ps aux | cut -d " " -f 1 | tail -n +2 | sort  | uniq -c | sort -n
    1 _dhcp
    1 _pflogd
    1 irc
    1 nobody
    1 spring
    3 bbs
    3 lntl
    3 postfix
    4 ben
    5 cleber
    8 uwu
   10 leeb
   12 slugmax
   23 katolaz
   50 root
 $
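To see why '-n' matters, here is a small sketch on made-up, unpadded
"count name" lines (sample_counts is a hypothetical helper). The
locale is pinned to C in the first command only to make the
lexicographical result predictable:

```shell
# Made-up "count name" lines, not padded and not sorted.
sample_counts() {
    printf '%s\n' '10 leeb' '3 bbs' '1 irc'
}

# Lexicographical order compares character by character,
# so "10 leeb" ends up before "3 bbs".
sample_counts | LC_ALL=C sort

# Numeric order compares the leading number as a whole.
sample_counts | sort -n
```

The first command yields 1, 10, 3, the second 1, 3, 10.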

If you want the list to be sorted in descending order of number of
processes, you just need to reverse the ordering, which can be done by
passing the option '-r' to sort(1):

 $ ps aux | cut -d " " -f 1 | tail -n +2 | sort  | uniq -c | sort -rn
   50 root
   23 katolaz
   12 slugmax
   10 leeb
    8 uwu
    5 cleber
    4 ben
    3 postfix
    3 lntl
    3 bbs
    1 spring
    1 nobody
    1 irc
    1 _pflogd
    1 _dhcp
 $
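If you find yourself typing this often, the part after ps(1) could be
wrapped in a shell function. This is only a sketch: the name
count_per_user is made up, and it assumes that its input looks like
"ps aux" output, with one header line and the user name in the first
column:

```shell
# Hypothetical helper: reads ps-like text on stdin and prints
# "count user" lines, busiest user first.
count_per_user() {
    cut -d " " -f 1 | tail -n +2 | sort | uniq -c | sort -rn
}

# On a live system one would run:
#   ps aux | count_per_user

# Here we feed it a few made-up lines instead.
printf '%s\n' 'USER PID' 'uwu 1' 'root 2' 'uwu 3' 'uwu 4' | count_per_user
```

With the made-up input above, uwu comes first with 3 processes,
followed by root with 1.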

This is the one-liner we had at the beginning of this post. The result
indicates that I should probably close some of the screens I am not
using... :P

 -+-+-+-

Most of the tools we have seen here were forged by the ancient dwarven
blacksmiths at Murray Hill, in the Eastern Lands, and have survived
almost unmodified in the unix environment for ages. In particular:

sort(1) appeared in UNIXv2 (March 1972)
uniq(1) appeared in UNIXv3 (February 1973)
tail(1) appeared in UNIXv7 (January 1979)

Some other tools, instead, were created in the Eastern Lands and
readjusted and perfected by the sapient master craftsmen of the West. In
particular:

ps(1)   appeared in UNIXv4 (November 1973), although the syntax for
       options that we have used here comes from early versions of
       BSD2.x (ca 1979-1980)