<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE html
  PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html" />
<title>How to compile dSFMT</title>
<style type="text/css">
  BLOCKQUOTE {background-color:#a0ffa0;
              padding-left: 1em;}
</style>
</head>
<body>
<h2>How to compile dSFMT</h2>

<p>
This document explains how to compile dSFMT from a terminal on
UNIX-like systems (for example Linux, FreeBSD, Cygwin, and OS X).
I can't help those who use an IDE (Integrated Development
Environment); please see your IDE's help to learn how to use the
SIMD features of your CPU.
</p>

<h3>1. First Step: Compile test programs using Makefile.</h3>
<h4>1-1. Compile standard C test program.</h4>
<p>
Check that dSFMT.c and Makefile are in your current directory.
If not, <strong>cd</strong> to the directory that contains them.
Then type
</p>
<blockquote>
<pre>make std</pre>
</blockquote>
<p>
If this causes an error, try typing
</p>
<blockquote>
<pre>cc -DDSFMT_MEXP=19937 -o test-std-M19937 dSFMT.c test.c</pre>
</blockquote>
<p>
or
</p>
<blockquote>
<pre>gcc -DDSFMT_MEXP=19937 -o test-std-M19937 dSFMT.c test.c</pre>
</blockquote>
<p>
If the build succeeds, check the test program. Type
</p>
<blockquote>
<pre>./test-std-M19937 -v</pre>
</blockquote>
<p>
You will see many random numbers displayed on your screen.
To check that these random numbers are the correct output,
redirect the output to a file and <strong>diff</strong> it with
<strong>dSFMT.19937.out.txt</strong>, like this:</p>
<blockquote>
<pre>./test-std-M19937 -v &gt; foo.txt
diff -w foo.txt dSFMT.19937.out.txt</pre>
</blockquote>
<p>
No output means the two files are the same, because <strong>diff</strong>
prints only the differences between them.
</p>
<p>
If you want to know the generation speed of dSFMT, type
</p>
<blockquote>
<pre>./test-std-M19937 -s</pre>
</blockquote>
<p>
It is very slow. To make it faster, compile it
with the <strong>-O3</strong> option. If your compiler is gcc, you
should also specify the <strong>-fno-strict-aliasing</strong> option
together with <strong>-O3</strong>. Type
</p>
<blockquote>
<pre>gcc -O3 -fno-strict-aliasing -DDSFMT_MEXP=19937 -o test-std-M19937 dSFMT.c test.c
./test-std-M19937 -s</pre>
</blockquote>
<p>
If you are using gcc 4.0, you can get more performance from dSFMT
by adding the options
<strong>--param max-inline-insns-single=1800</strong>,
<strong>--param inline-unit-growth=500</strong> and
<strong>--param large-function-growth=900</strong>.
</p>

<h4>1-2. Compile SSE2 test program.</h4>
<p>
If your CPU supports SSE2 and you can use gcc version 3.4 or later,
you can make test-sse2-M19937. To do this, type
</p>
<blockquote>
<pre>make sse2</pre>
</blockquote>
<p>or type</p>
<blockquote>
<pre>gcc -O3 -msse2 -fno-strict-aliasing -DHAVE_SSE2=1 -DDSFMT_MEXP=19937 -o test-sse2-M19937 dSFMT.c test.c</pre>
</blockquote>
<p>If everything works well,</p>
<blockquote>
<pre>./test-sse2-M19937 -s</pre>
</blockquote>
<p>shows a much shorter time than <strong>test-std-M19937 -s</strong>.</p>

<h4>1-3. Compile AltiVec test program.</h4>
<p>
If you are using a Macintosh computer with a PowerPC G4 or G5, and
your gcc version is 3.3 or later, you can make test-alti-M19937. To
do this, type
</p>
<blockquote>
<pre>make osx-alti</pre>
</blockquote>
<p>or type</p>
<blockquote>
<pre>gcc -O3 -faltivec -fno-strict-aliasing -DHAVE_ALTIVEC=1 -DDSFMT_MEXP=19937 -o test-alti-M19937 dSFMT.c test.c</pre>
</blockquote>
<p>If everything works well,</p>
<blockquote>
<pre>./test-alti-M19937 -s</pre>
</blockquote>
<p>shows a much shorter time than <strong>test-std-M19937 -s</strong>.</p>

<h4>1-4. Compile and check output automatically.</h4>
<p>
To build the test programs and check their output
automatically for all supported DSFMT_MEXP values of dSFMT, type
</p>
<blockquote>
<pre>make std-check</pre>
</blockquote>
<p>
To check the test programs optimized for SSE2, type
</p>
<blockquote>
<pre>make sse2-check</pre>
</blockquote>
<p>
To check the test programs optimized for OS X PowerPC AltiVec, type
</p>
<blockquote>
<pre>make osx-alti-check</pre>
</blockquote>
<p>
These commands may take some time.
</p>

<h3>2. Second Step: Use dSFMT pseudorandom number generator with
your C program.</h3>
<h4>2-1. Use sequential call and static link.</h4>
<p>
Here is a very simple program, <strong>sample1.c</strong>, which
calculates PI using the Monte Carlo method.
</p>
<blockquote>
<pre>
#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include "dSFMT.h"

int main(int argc, char* argv[]) {
    int i, cnt, seed;
    double x, y, pi;
    const int NUM = 10000;
    dsfmt_t dsfmt;

    /* Take the seed from the command line if one is given. */
    if (argc &gt;= 2) {
        seed = strtol(argv[1], NULL, 10);
    } else {
        seed = 12345;
    }
    cnt = 0;
    dsfmt_init_gen_rand(&amp;dsfmt, seed);
    /* Count the random points that fall inside the quarter circle. */
    for (i = 0; i &lt; NUM; i++) {
        x = dsfmt_genrand_close_open(&amp;dsfmt);
        y = dsfmt_genrand_close_open(&amp;dsfmt);
        if (x * x + y * y &lt; 1.0) {
            cnt++;
        }
    }
    pi = (double)cnt / NUM * 4;
    printf("%f\n", pi);
    return 0;
}
</pre>
</blockquote>
<p>To compile <strong>sample1.c</strong> together with dSFMT.c, using the
period of 2<sup>521</sup>, type</p>
<blockquote>
<pre>gcc -DDSFMT_MEXP=521 -o sample1 dSFMT.c sample1.c</pre>
</blockquote>
<p>If your CPU supports SSE2 and you want to use dSFMT optimized for
SSE2, type</p>
<blockquote>
<pre>gcc -msse2 -DDSFMT_MEXP=521 -DHAVE_SSE2 -o sample1 dSFMT.c sample1.c</pre>
</blockquote>
<p>If your computer is an Apple PowerPC G4 or G5 and you want to use
dSFMT optimized for AltiVec, type</p>
<blockquote>
<pre>gcc -faltivec -DDSFMT_MEXP=521 -DHAVE_ALTIVEC -o sample1 dSFMT.c sample1.c</pre>
</blockquote>

<h4>2-2. Use block call and static link.</h4>
<p>
Here is <strong>sample2.c</strong>, which modifies sample1.c.
The block call <strong>dsfmt_fill_array_close_open</strong> is
much faster than sequential calls, but it requires aligned
memory. The standard function for obtaining aligned memory
is <strong>posix_memalign</strong>, but it is not available on every
OS.
</p>
<blockquote>
<pre>
#define _XOPEN_SOURCE 600   /* must be defined before any #include */
#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include "dSFMT.h"

int main(int argc, char* argv[]) {
    int i, j, cnt, seed;
    double x, y, pi;
    const int NUM = 10000;
    const int R_SIZE = 2 * NUM;
    int size;
    double *array;
    dsfmt_t dsfmt;

    if (argc &gt;= 2) {
        seed = strtol(argv[1], NULL, 10);
    } else {
        seed = 12345;
    }
    /* The array must hold at least the minimum size dSFMT requires. */
    size = dsfmt_get_min_array_size();
    if (size &lt; R_SIZE) {
        size = R_SIZE;
    }
#if defined(__APPLE__) || \
    (defined(__FreeBSD__) &amp;&amp; __FreeBSD__ &gt;= 3 &amp;&amp; __FreeBSD__ &lt;= 6)
    printf("malloc used\n");
    array = malloc(sizeof(double) * size);
    if (array == NULL) {
        printf("can't allocate memory.\n");
        return 1;
    }
#elif defined(_POSIX_C_SOURCE)
    printf("posix_memalign used\n");
    if (posix_memalign((void **)&amp;array, 16, sizeof(double) * size) != 0) {
        printf("can't allocate memory.\n");
        return 1;
    }
#elif defined(__GNUC__) &amp;&amp; (__GNUC__ &gt; 3 || (__GNUC__ == 3 &amp;&amp; __GNUC_MINOR__ &gt;= 3))
    printf("memalign used\n");
    array = memalign(16, sizeof(double) * size);
    if (array == NULL) {
        printf("can't allocate memory.\n");
        return 1;
    }
#else /* in this case, gcc doesn't support SSE2 */
    array = malloc(sizeof(double) * size);
    if (array == NULL) {
        printf("can't allocate memory.\n");
        return 1;
    }
#endif
    cnt = 0;
    j = 0;
    dsfmt_init_gen_rand(&amp;dsfmt, seed);
    /* Fill the whole array with one block call. */
    dsfmt_fill_array_close_open(&amp;dsfmt, array, size);
    for (i = 0; i &lt; NUM; i++) {
        x = array[j++];
        y = array[j++];
        if (x * x + y * y &lt; 1.0) {
            cnt++;
        }
    }
    free(array);
    pi = (double)cnt / NUM * 4;
    printf("%f\n", pi);
    return 0;
}
</pre>
</blockquote>
<p>To compile <strong>sample2.c</strong> together with dSFMT.c, using the
period of 2<sup>2203</sup>, type</p>
<blockquote>
<pre>gcc -DDSFMT_MEXP=2203 -o sample2 dSFMT.c sample2.c</pre>
</blockquote>
<p>If your CPU supports SSE2 and you want to use dSFMT optimized for
SSE2, type</p>
<blockquote>
<pre>gcc -msse2 -DDSFMT_MEXP=2203 -DHAVE_SSE2 -o sample2 dSFMT.c sample2.c</pre>
</blockquote>
<p>If your computer is an Apple PowerPC G4 or G5 and you want to use
dSFMT optimized for AltiVec, type</p>
<blockquote>
<pre>gcc -faltivec -DDSFMT_MEXP=2203 -DHAVE_ALTIVEC -o sample2 dSFMT.c sample2.c</pre>
</blockquote>
<h4>2-3. Initialize dSFMT using dsfmt_init_by_array function.</h4>
<p>
Here is <strong>sample3.c</strong>, which modifies sample1.c.
A 32-bit integer seed can produce only 2<sup>32</sup> different
initial states. To avoid this limitation, dSFMT
provides the <strong>dsfmt_init_by_array</strong> function. This sample
uses dsfmt_init_by_array, which initializes the internal state
array with an array of 32-bit integers. The array may be
larger than the internal state array, and all of its elements
are used for initialization, but an overly large array is
wasteful.
</p>
<blockquote>
<pre>
#include &lt;stdio.h&gt;
#include &lt;string.h&gt;
#include "dSFMT.h"

int main(int argc, char* argv[]) {
    int i, cnt, seed_cnt;
    double x, y, pi;
    const int NUM = 10000;
    uint32_t seeds[100];
    dsfmt_t dsfmt;

    if (argc &gt;= 2) {
        /* Use each character of the argument as one 32-bit seed. */
        seed_cnt = 0;
        for (i = 0; (i &lt; 100) &amp;&amp; ((size_t)i &lt; strlen(argv[1])); i++) {
            seeds[i] = argv[1][i];
            seed_cnt++;
        }
    } else {
        seeds[0] = 12345;
        seed_cnt = 1;
    }
    cnt = 0;
    dsfmt_init_by_array(&amp;dsfmt, seeds, seed_cnt);
    for (i = 0; i &lt; NUM; i++) {
        x = dsfmt_genrand_close_open(&amp;dsfmt);
        y = dsfmt_genrand_close_open(&amp;dsfmt);
        if (x * x + y * y &lt; 1.0) {
            cnt++;
        }
    }
    pi = (double)cnt / NUM * 4;
    printf("%f\n", pi);
    return 0;
}
</pre>
</blockquote>
<p>To compile <strong>sample3.c</strong>, type</p>
<blockquote>
<pre>gcc -DDSFMT_MEXP=1279 -o sample3 dSFMT.c sample3.c</pre>
</blockquote>
<p>Now the seed can be a string, like this:</p>
<blockquote>
<pre>./sample3 your-full-name</pre>
</blockquote>
</body>
</html>