| libzahls-design.tex - libzahl - big integer library | |
| git clone git://git.suckless.org/libzahl | |
| Log | |
| Files | |
| Refs | |
| README | |
| LICENSE | |
| --- | |
| libzahls-design.tex (7474B) | |
| --- | |
| 1 \chapter{libzahl's design} | |
| 2 \label{chap:libzahl's design} | |
| 3 | |
| 4 In this chapter, the design of libzahl is discussed. | |
| 5 | |
| 6 \vspace{1cm} | |
| 7 \minitoc | |
| 8 | |
| 9 | |
| 10 \newpage | |
| 11 \section{Memory pool} | |
| 12 \label{sec:Memory pool} | |
| 13 | |
| 14 Allocating memory dynamically is an expensive operation. | |
| 15 To improve performance, libzahl never deallocates memory | |
| 16 before the library is uninitialised, instead it pools | |
| 17 memory, that is no longer needed, for reuse. | |
| 18 | |
| 19 Because of the memory pooling, this is a pattern to the | |
| 20 allocation sizes. In an allocation, a power of two | |
| 21 elements, plus a few elements that are discussed in | |
| 22 \secref{sec:Integer structure}, are allocated. That is, | |
| 23 the number multiplied by the size of an element. | |
| 24 Powers of two (growth factor 2) is not the most memory | |
| 25 efficient way to do this, but it is the simplest and | |
| 26 performance efficient. This power of two (sans the few | |
| 27 extra elements) is used to calculate — getting the index | |
| 28 of the only set bit — the index of the bucket in | |
| 29 which the allocation is stored when pooled. The buckets | |
| 30 are dynamic arrays with the growth factor 1.5. The | |
| 31 growth factor 1.5 is often used for dynamic arrays, it | |
| 32 is a good compromise between memory usage and performance. | |
| 33 | |
| 34 libzahl also avoids allocating memory by having a set | |
| 35 of temporary variables predefined. | |
| 36 | |
| 37 | |
| 38 \newpage | |
| 39 \section{Error handling} | |
| 40 \label{sec:Error handling} | |
| 41 | |
| 42 In C, it is traditional to return a sentiel value | |
| 43 in case an error has occurred, and set the value | |
| 44 of a global variable to describe the error that | |
| 45 has occurred. The programmer can choose whether to | |
| 46 check for errors, ignore errors where it does not | |
| 47 matter, or simple ignore errors altogether and let | |
| 48 the program eventually crash. This is a simple | |
| 49 technique that gives the programmer a better | |
| 50 understanding of what can happen. A great advantage | |
| 51 C has over most programming languages. | |
| 52 | |
| 53 Another technique is to use long jumps on error. | |
| 54 This technique is not too common, but is has one | |
| 55 significant advantage. Error-checks need only be | |
| 56 preformed where the error can first be detected. | |
| 57 There is no need to check the return value at every | |
| 58 function return. This leads to cleaner code, if | |
| 59 there are many functions that can raise exceptional | |
| 60 conditions, and greater performance under some | |
| 61 conditions. This is why this technique is sometimes | |
| 62 used in high-performance libraries. libzahl uses | |
| 63 this technique. | |
| 64 | |
| 65 Rather than writing | |
| 66 | |
| 67 \begin{alltt} | |
| 68 if (zadd(a, b, c)) | |
| 69 goto out; | |
| 70 \end{alltt} | |
| 71 | |
| 72 \noindent | |
| 73 or a bit cleaner, if there are a lot of calls, | |
| 74 | |
| 75 \begin{alltt} | |
| 76 #define TRY(...) do if (__VA_ARGS__) goto out; while (0) | |
| 77 \textcolor{c}{/* \textrm{\ldots} */} | |
| 78 TRY(zadd(a, b, c)); | |
| 79 \end{alltt} | |
| 80 | |
| 81 \noindent | |
| 82 we write | |
| 83 | |
| 84 \begin{alltt} | |
| 85 jmp_buf env; | |
| 86 if (setjmp(env)) | |
| 87 goto out; | |
| 88 zsetup(env); | |
| 89 \textcolor{c}{/* \textrm{\ldots} */} | |
| 90 zadd(a, b, c); | |
| 91 \end{alltt} | |
| 92 | |
| 93 You only need to call {\tt setjmp} and {\tt zsetup} | |
| 94 once, but can update the return point by calling | |
| 95 them once more. | |
| 96 | |
| 97 If you don't need to check for errors, you can | |
| 98 disable error detection at compile-time. By defining | |
| 99 the {\tt ZAHL\_UNSAFE} C preprocessor definition | |
| 100 when compiling libzahl, and when compiling your | |
| 101 software that uses libzahl. | |
| 102 | |
| 103 | |
| 104 \newpage | |
| 105 \section{Integer structure} | |
| 106 \label{sec:Integer structure} | |
| 107 | |
| 108 The data type used to represent a big integer with | |
| 109 libzahl is {\tt z\_t},\footnote{This name actually | |
| 110 violates the naming convention; it should be {\tt Z}, | |
| 111 or {\tt Zahl} to avoid single-letter names. But this | |
| 112 violation is common-place.} defined as | |
| 113 | |
| 114 \begin{alltt} | |
| 115 typedef struct zahl z_t[1]; | |
| 116 \end{alltt} | |
| 117 | |
| 118 \noindent | |
| 119 where {\tt struct zahl} is defined as | |
| 120 | |
| 121 \begin{alltt} | |
| 122 struct zahl \{ | |
| 123 int sign; \textcolor{c}{/* \textrm{\emph{not} short fo… | |
| 124 size_t used; | |
| 125 size_t alloced; \textcolor{c}{/* \textrm{short for `allocate… | |
| 126 zahl_char_t *chars; \textcolor{c}{/* \textrm{short for `characte… | |
| 127 \}; | |
| 128 \end{alltt} | |
| 129 | |
| 130 \noindent | |
| 131 where {\tt zahl\_char\_t} is defined as | |
| 132 | |
| 133 \begin{alltt} | |
| 134 typedef uint64_t zahl_char_t; | |
| 135 \end{alltt} | |
| 136 | |
| 137 \noindent | |
| 138 As a user, try not to think about anything else than | |
| 139 | |
| 140 \begin{alltt} | |
| 141 typedef \textcolor{c}{/* \textrm{ignore what is here} */} z_t[1]; | |
| 142 \end{alltt} | |
| 143 | |
| 144 \noindent | |
| 145 details can change in future versions of libzahl. | |
| 146 | |
| 147 {\tt z\_t} is defined as a single-element array. This | |
| 148 is often called a reference, or a call-by-reference. | |
| 149 There are some flexibility issues with this, why | |
| 150 {\tt struct zahl} has beed added, but for most uses | |
| 151 with big integers, it makes things simpler. Particularly, | |
| 152 you need not work prepend {\tt \&} to variable when making | |
| 153 function calls, but the existence of {\tt struct zahl} | |
| 154 allows you do so if you so choose. | |
| 155 | |
| 156 The {\tt .sign} member, is either $-1$, 0, or 1, | |
| 157 when the integer is negative, zero, or positive, | |
| 158 respectively. Whenever {\tt .sign} is 0, the value | |
| 159 of {\tt .used} and {\tt .chars} are undefined. | |
| 160 | |
| 161 {\tt .used} holds to the number of elements used in | |
| 162 {\tt .chars}, and {\tt .alloced} holds the allocation | |
| 163 side of {\tt .chars} measured in elements minus a few | |
| 164 extra elements that are always added to the allocation. | |
| 165 {\tt .chars} is a little-endian array of 64-bit digits, | |
| 166 these 64-bit digits are called `characters' in libzahl. | |
| 167 {\tt .chars} holds the absolute value of the | |
| 168 represented value. | |
| 169 | |
| 170 Unless {\tt .sign} is 0, {\tt .chars} always contains | |
| 171 four extra elements, refered to as fluff. These are | |
| 172 merely allocated so functions can assume that they can | |
| 173 always manipulate groups of four characters, and need | |
| 174 not care about cases where the number of characters is | |
| 175 not a multiple of four. There are of course a few cases | |
| 176 when the precise number of characters is important. | |
| 177 | |
| 178 | |
| 179 \newpage | |
| 180 \section{Parameters} | |
| 181 \label{sec:Parameters} | |
| 182 | |
| 183 The general order of parameters in libzahl functions | |
| 184 are: output integers, input integers, input data, | |
| 185 output data, parametric values. For example, in | |
| 186 addition, the out parameter is the first parameter. | |
| 187 But for marshalling and unmarshalling the buffer | |
| 188 is last. For random number generation the order is: | |
| 189 output, device, distribution, distribution parameters. | |
| 190 Whilst the distribution parameters are big integers, | |
| 191 they are not considered input integers. The order | |
| 192 of the input parameters are that of the order you | |
| 193 would write them using mathematical notation, this | |
| 194 also holds true if you include the output parameter | |
| 195 (as long as there is exactly one output), for example | |
| 196 | |
| 197 \vspace{1em} | |
| 198 $a \gets b^c \mod d$ | |
| 199 \vspace{1em} | |
| 200 | |
| 201 \noindent | |
| 202 is written | |
| 203 | |
| 204 \begin{verbatim} | |
| 205 zmodpow(a, b, c, d); | |
| 206 \end{verbatim} | |
| 207 | |
| 208 \noindent | |
| 209 or | |
| 210 | |
| 211 \begin{verbatim} | |
| 212 zmodpowu(a, b, c, d); | |
| 213 \end{verbatim} | |
| 214 | |
| 215 Like any self respecting bignum library, libzahl | |
| 216 supports using the same big integer reference as | |
| 217 for output as input, as long as all the output | |
| 218 parameters are mutually unique. For example | |
| 219 | |
| 220 \begin{alltt} | |
| 221 a += b; | |
| 222 \end{alltt} | |
| 223 | |
| 224 \noindent | |
| 225 or | |
| 226 | |
| 227 \begin{alltt} | |
| 228 a = a + b; | |
| 229 \end{alltt} | |
| 230 | |
| 231 \noindent | |
| 232 is written, using libzahl, as | |
| 233 | |
| 234 \begin{alltt} | |
| 235 zadd(a, a, b); | |
| 236 \end{alltt} | |
| 237 | |
| 238 For commutative functions, like {\tt zadd}, the | |
| 239 implementation is optimised to assume that this | |
| 240 order is more likely to be used than the alternative. | |
| 241 That is, we should, for example, write | |
| 242 | |
| 243 \begin{alltt} | |
| 244 zadd(a, a, b); | |
| 245 \end{alltt} | |
| 246 | |
| 247 \noindent | |
| 248 rather than | |
| 249 | |
| 250 \begin{alltt} | |
| 251 zadd(a, b, a); | |
| 252 \end{alltt} | |
| 253 | |
| 254 \noindent | |
| 255 This assumption is not made for non-commutative | |
| 256 functions. | |
| 257 | |
| 258 When writting your own functions, be aware, | |
| 259 input parameters are generally not declared {\tt const} | |
| 260 in libzahl. Currently, some functions actually make | |
| 261 modifications (that do not affect the value) to | |
| 262 input parameters. |