libzahls-design.tex - libzahl - big integer library | |
git clone git://git.suckless.org/libzahl | |
Log | |
Files | |
Refs | |
README | |
LICENSE | |
--- | |
libzahls-design.tex (7474B) | |
--- | |
1 \chapter{libzahl's design} | |
2 \label{chap:libzahl's design} | |
3 | |
4 In this chapter, the design of libzahl is discussed. | |
5 | |
6 \vspace{1cm} | |
7 \minitoc | |
8 | |
9 | |
10 \newpage | |
11 \section{Memory pool} | |
12 \label{sec:Memory pool} | |
13 | |
14 Allocating memory dynamically is an expensive operation. | |
15 To improve performance, libzahl never deallocates memory | |
16 before the library is uninitialised, instead it pools | |
17 memory, that is no longer needed, for reuse. | |
18 | |
19 Because of the memory pooling, this is a pattern to the | |
20 allocation sizes. In an allocation, a power of two | |
21 elements, plus a few elements that are discussed in | |
22 \secref{sec:Integer structure}, are allocated. That is, | |
23 the number multiplied by the size of an element. | |
24 Powers of two (growth factor 2) is not the most memory | |
25 efficient way to do this, but it is the simplest and | |
26 performance efficient. This power of two (sans the few | |
27 extra elements) is used to calculate — getting the index | |
28 of the only set bit — the index of the bucket in | |
29 which the allocation is stored when pooled. The buckets | |
30 are dynamic arrays with the growth factor 1.5. The | |
31 growth factor 1.5 is often used for dynamic arrays, it | |
32 is a good compromise between memory usage and performance. | |
33 | |
34 libzahl also avoids allocating memory by having a set | |
35 of temporary variables predefined. | |
36 | |
37 | |
38 \newpage | |
39 \section{Error handling} | |
40 \label{sec:Error handling} | |
41 | |
42 In C, it is traditional to return a sentiel value | |
43 in case an error has occurred, and set the value | |
44 of a global variable to describe the error that | |
45 has occurred. The programmer can choose whether to | |
46 check for errors, ignore errors where it does not | |
47 matter, or simple ignore errors altogether and let | |
48 the program eventually crash. This is a simple | |
49 technique that gives the programmer a better | |
50 understanding of what can happen. A great advantage | |
51 C has over most programming languages. | |
52 | |
53 Another technique is to use long jumps on error. | |
54 This technique is not too common, but is has one | |
55 significant advantage. Error-checks need only be | |
56 preformed where the error can first be detected. | |
57 There is no need to check the return value at every | |
58 function return. This leads to cleaner code, if | |
59 there are many functions that can raise exceptional | |
60 conditions, and greater performance under some | |
61 conditions. This is why this technique is sometimes | |
62 used in high-performance libraries. libzahl uses | |
63 this technique. | |
64 | |
65 Rather than writing | |
66 | |
67 \begin{alltt} | |
68 if (zadd(a, b, c)) | |
69 goto out; | |
70 \end{alltt} | |
71 | |
72 \noindent | |
73 or a bit cleaner, if there are a lot of calls, | |
74 | |
75 \begin{alltt} | |
76 #define TRY(...) do if (__VA_ARGS__) goto out; while (0) | |
77 \textcolor{c}{/* \textrm{\ldots} */} | |
78 TRY(zadd(a, b, c)); | |
79 \end{alltt} | |
80 | |
81 \noindent | |
82 we write | |
83 | |
84 \begin{alltt} | |
85 jmp_buf env; | |
86 if (setjmp(env)) | |
87 goto out; | |
88 zsetup(env); | |
89 \textcolor{c}{/* \textrm{\ldots} */} | |
90 zadd(a, b, c); | |
91 \end{alltt} | |
92 | |
93 You only need to call {\tt setjmp} and {\tt zsetup} | |
94 once, but can update the return point by calling | |
95 them once more. | |
96 | |
97 If you don't need to check for errors, you can | |
98 disable error detection at compile-time. By defining | |
99 the {\tt ZAHL\_UNSAFE} C preprocessor definition | |
100 when compiling libzahl, and when compiling your | |
101 software that uses libzahl. | |
102 | |
103 | |
104 \newpage | |
105 \section{Integer structure} | |
106 \label{sec:Integer structure} | |
107 | |
108 The data type used to represent a big integer with | |
109 libzahl is {\tt z\_t},\footnote{This name actually | |
110 violates the naming convention; it should be {\tt Z}, | |
111 or {\tt Zahl} to avoid single-letter names. But this | |
112 violation is common-place.} defined as | |
113 | |
114 \begin{alltt} | |
115 typedef struct zahl z_t[1]; | |
116 \end{alltt} | |
117 | |
118 \noindent | |
119 where {\tt struct zahl} is defined as | |
120 | |
121 \begin{alltt} | |
122 struct zahl \{ | |
123 int sign; \textcolor{c}{/* \textrm{\emph{not} short fo… | |
124 size_t used; | |
125 size_t alloced; \textcolor{c}{/* \textrm{short for `allocate… | |
126 zahl_char_t *chars; \textcolor{c}{/* \textrm{short for `characte… | |
127 \}; | |
128 \end{alltt} | |
129 | |
130 \noindent | |
131 where {\tt zahl\_char\_t} is defined as | |
132 | |
133 \begin{alltt} | |
134 typedef uint64_t zahl_char_t; | |
135 \end{alltt} | |
136 | |
137 \noindent | |
138 As a user, try not to think about anything else than | |
139 | |
140 \begin{alltt} | |
141 typedef \textcolor{c}{/* \textrm{ignore what is here} */} z_t[1]; | |
142 \end{alltt} | |
143 | |
144 \noindent | |
145 details can change in future versions of libzahl. | |
146 | |
147 {\tt z\_t} is defined as a single-element array. This | |
148 is often called a reference, or a call-by-reference. | |
149 There are some flexibility issues with this, why | |
150 {\tt struct zahl} has beed added, but for most uses | |
151 with big integers, it makes things simpler. Particularly, | |
152 you need not work prepend {\tt \&} to variable when making | |
153 function calls, but the existence of {\tt struct zahl} | |
154 allows you do so if you so choose. | |
155 | |
156 The {\tt .sign} member, is either $-1$, 0, or 1, | |
157 when the integer is negative, zero, or positive, | |
158 respectively. Whenever {\tt .sign} is 0, the value | |
159 of {\tt .used} and {\tt .chars} are undefined. | |
160 | |
161 {\tt .used} holds to the number of elements used in | |
162 {\tt .chars}, and {\tt .alloced} holds the allocation | |
163 side of {\tt .chars} measured in elements minus a few | |
164 extra elements that are always added to the allocation. | |
165 {\tt .chars} is a little-endian array of 64-bit digits, | |
166 these 64-bit digits are called `characters' in libzahl. | |
167 {\tt .chars} holds the absolute value of the | |
168 represented value. | |
169 | |
170 Unless {\tt .sign} is 0, {\tt .chars} always contains | |
171 four extra elements, refered to as fluff. These are | |
172 merely allocated so functions can assume that they can | |
173 always manipulate groups of four characters, and need | |
174 not care about cases where the number of characters is | |
175 not a multiple of four. There are of course a few cases | |
176 when the precise number of characters is important. | |
177 | |
178 | |
179 \newpage | |
180 \section{Parameters} | |
181 \label{sec:Parameters} | |
182 | |
183 The general order of parameters in libzahl functions | |
184 are: output integers, input integers, input data, | |
185 output data, parametric values. For example, in | |
186 addition, the out parameter is the first parameter. | |
187 But for marshalling and unmarshalling the buffer | |
188 is last. For random number generation the order is: | |
189 output, device, distribution, distribution parameters. | |
190 Whilst the distribution parameters are big integers, | |
191 they are not considered input integers. The order | |
192 of the input parameters are that of the order you | |
193 would write them using mathematical notation, this | |
194 also holds true if you include the output parameter | |
195 (as long as there is exactly one output), for example | |
196 | |
197 \vspace{1em} | |
198 $a \gets b^c \mod d$ | |
199 \vspace{1em} | |
200 | |
201 \noindent | |
202 is written | |
203 | |
204 \begin{verbatim} | |
205 zmodpow(a, b, c, d); | |
206 \end{verbatim} | |
207 | |
208 \noindent | |
209 or | |
210 | |
211 \begin{verbatim} | |
212 zmodpowu(a, b, c, d); | |
213 \end{verbatim} | |
214 | |
215 Like any self respecting bignum library, libzahl | |
216 supports using the same big integer reference as | |
217 for output as input, as long as all the output | |
218 parameters are mutually unique. For example | |
219 | |
220 \begin{alltt} | |
221 a += b; | |
222 \end{alltt} | |
223 | |
224 \noindent | |
225 or | |
226 | |
227 \begin{alltt} | |
228 a = a + b; | |
229 \end{alltt} | |
230 | |
231 \noindent | |
232 is written, using libzahl, as | |
233 | |
234 \begin{alltt} | |
235 zadd(a, a, b); | |
236 \end{alltt} | |
237 | |
238 For commutative functions, like {\tt zadd}, the | |
239 implementation is optimised to assume that this | |
240 order is more likely to be used than the alternative. | |
241 That is, we should, for example, write | |
242 | |
243 \begin{alltt} | |
244 zadd(a, a, b); | |
245 \end{alltt} | |
246 | |
247 \noindent | |
248 rather than | |
249 | |
250 \begin{alltt} | |
251 zadd(a, b, a); | |
252 \end{alltt} | |
253 | |
254 \noindent | |
255 This assumption is not made for non-commutative | |
256 functions. | |
257 | |
258 When writting your own functions, be aware, | |
259 input parameters are generally not declared {\tt const} | |
260 in libzahl. Currently, some functions actually make | |
261 modifications (that do not affect the value) to | |
262 input parameters. |