source: Daodan/src/dSFMT/html/howto-compile.html@ 599

Last change on this file since 599 was 440, checked in by rossy, 15 years ago

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE html
  PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html" />
    <title>How to compile dSFMT</title>
    <style type="text/css">
      BLOCKQUOTE {background-color:#a0ffa0;
                  padding-left: 1em;}
    </style>
  </head>
  <body>
    <h2> How to compile dSFMT</h2>

    <p>
      This document explains how to compile dSFMT from a terminal on
      UNIX-like systems (for example Linux, FreeBSD, Cygwin, or OS X).
      It does not cover IDEs (Integrated Development Environments); if
      you use one, see its documentation for how to enable the SIMD
      features of your CPU.
    </p>

    <h3>1. First Step: Compile test programs using Makefile.</h3>
    <h4>1-1. Compile standard C test program.</h4>
    <p>
      Check that dSFMT.c and Makefile are in your current directory.
      If not, <strong>cd</strong> to the directory that contains them.
      Then type
    </p>
    <blockquote>
      <pre>make std</pre>
    </blockquote>
    <p>
      If that causes an error, try
    </p>
    <blockquote>
      <pre>cc -DDSFMT_MEXP=19937 -o test-std-M19937 dSFMT.c test.c</pre>
    </blockquote>
    <p>
      or
    </p>
    <blockquote>
      <pre>gcc -DDSFMT_MEXP=19937 -o test-std-M19937 dSFMT.c test.c</pre>
    </blockquote>
    <p>
      If the build succeeds, run the test program. Type
    </p>
    <blockquote>
      <pre>./test-std-M19937 -v</pre>
    </blockquote>
    <p>
      You will see many random numbers displayed on your screen.
      To check that these numbers are the correct output,
      redirect the output to a file and <strong>diff</strong> it against
      <strong>dSFMT.19937.out.txt</strong>, like this:</p>
    <blockquote>
      <pre>./test-std-M19937 -v > foo.txt
diff -w foo.txt dSFMT.19937.out.txt</pre>
    </blockquote>
    <p>
      No output means the files are the same, because <strong>diff</strong>
      prints only the differences between two files.
    </p>
    <p>
      To measure the generation speed of dSFMT, type
    </p>
    <blockquote>
      <pre>./test-std-M19937 -s</pre>
    </blockquote>
    <p>
      Without optimization it is very slow. To make it fast, compile
      with the <strong>-O3</strong> option. If your compiler is gcc, you
      should also specify the <strong>-fno-strict-aliasing</strong> option
      together with <strong>-O3</strong>. Type
    </p>
    <blockquote>
      <pre>gcc -O3 -fno-strict-aliasing -DDSFMT_MEXP=19937 -o test-std-M19937 dSFMT.c test.c
./test-std-M19937 -s</pre>
    </blockquote>
    <p>
      If you are using gcc 4.0, you can get more performance out of dSFMT
      by adding the options
      <strong>--param max-inline-insns-single=1800</strong>,
      <strong>--param inline-unit-growth=500</strong> and
      <strong>--param large-function-growth=900</strong>.
    </p>

    <h4>1-2. Compile SSE2 test program.</h4>
    <p>
      If your CPU supports SSE2 and you have gcc version 3.4 or later,
      you can build test-sse2-M19937. To do this, type
    </p>
    <blockquote>
      <pre>make sse2</pre>
    </blockquote>
    <p>or type</p>
    <blockquote>
      <pre>gcc -O3 -msse2 -fno-strict-aliasing -DHAVE_SSE2=1 -DDSFMT_MEXP=19937 -o test-sse2-M19937 dSFMT.c test.c</pre>
    </blockquote>
    <p>If everything works well,</p>
    <blockquote>
      <pre>./test-sse2-M19937 -s</pre>
    </blockquote>
    <p>reports a much shorter time than <strong>./test-std-M19937 -s</strong>.</p>

    <h4>1-3. Compile AltiVec test program.</h4>
    <p>
      If you are using a Macintosh computer with a PowerPC G4 or G5, and
      your gcc version is 3.3 or later, you can build test-alti-M19937. To
      do this, type
    </p>
    <blockquote>
      <pre>make osx-alti</pre>
    </blockquote>
    <p>or type</p>
    <blockquote>
      <pre>gcc -O3 -faltivec -fno-strict-aliasing -DHAVE_ALTIVEC=1 -DDSFMT_MEXP=19937 -o test-alti-M19937 dSFMT.c test.c</pre>
    </blockquote>
    <p>If everything works well,</p>
    <blockquote>
      <pre>./test-alti-M19937 -s</pre>
    </blockquote>
    <p>reports a much shorter time than <strong>./test-std-M19937 -s</strong>.</p>

    <h4>1-4. Compile and check output automatically.</h4>
    <p>
      To build the test programs and check their output
      automatically for every DSFMT_MEXP value that dSFMT supports, type
    </p>
    <blockquote>
      <pre>make std-check</pre>
    </blockquote>
    <p>
      To check the test programs optimized for SSE2, type
    </p>
    <blockquote>
      <pre>make sse2-check</pre>
    </blockquote>
    <p>
      To check the test programs optimized for AltiVec on PowerPC OS X, type
    </p>
    <blockquote>
      <pre>make osx-alti-check</pre>
    </blockquote>
    <p>
      These commands may take some time.
    </p>

    <h3>2. Second Step: Use dSFMT pseudorandom number generator with
      your C program.</h3>
    <h4>2-1. Use sequential call and static link.</h4>
    <p>
      Here is a very simple program, <strong>sample1.c</strong>, which
      estimates pi using the Monte Carlo method.
    </p>
    <blockquote>
      <pre>
#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include "dSFMT.h"

int main(int argc, char* argv[]) {
    int i, cnt, seed;
    double x, y, pi;
    const int NUM = 10000;
    dsfmt_t dsfmt;

    /* take the seed from the command line if one is given */
    if (argc &gt;= 2) {
        seed = strtol(argv[1], NULL, 10);
    } else {
        seed = 12345;
    }
    cnt = 0;
    dsfmt_init_gen_rand(&amp;dsfmt, seed);
    /* count the random points that fall inside the quarter circle */
    for (i = 0; i &lt; NUM; i++) {
        x = dsfmt_genrand_close_open(&amp;dsfmt);
        y = dsfmt_genrand_close_open(&amp;dsfmt);
        if (x * x + y * y &lt; 1.0) {
            cnt++;
        }
    }
    /* the fraction of points inside approximates pi / 4 */
    pi = (double)cnt / NUM * 4;
    printf("%f\n", pi);
    return 0;
}
      </pre>
    </blockquote>
    <p>To compile <strong>sample1.c</strong> with dSFMT.c using the
      Mersenne exponent 521 (a period of at least 2<sup>521</sup>-1), type</p>
    <blockquote>
      <pre>gcc -DDSFMT_MEXP=521 -o sample1 dSFMT.c sample1.c</pre>
    </blockquote>
    <p>If your CPU supports SSE2 and you want to use the version of dSFMT
      optimized for SSE2, type</p>
    <blockquote>
      <pre>gcc -msse2 -DDSFMT_MEXP=521 -DHAVE_SSE2 -o sample1 dSFMT.c sample1.c</pre>
    </blockquote>
    <p>If your computer is an Apple PowerPC G4 or G5 and you want to use
      the version of dSFMT optimized for AltiVec, type</p>
    <blockquote>
      <pre>gcc -faltivec -DDSFMT_MEXP=521 -DHAVE_ALTIVEC -o sample1 dSFMT.c sample1.c</pre>
    </blockquote>

    <h4>2-2. Use block call and static link.</h4>
    <p>
      Here is <strong>sample2.c</strong>, a modified version of sample1.c.
      The block call <strong>dsfmt_fill_array_close_open</strong> is
      much faster than the sequential call, but it needs aligned
      memory. The standard function for allocating aligned memory
      is <strong>posix_memalign</strong>, but it is not available on every
      OS.
    </p>
    <blockquote>
      <pre>
/* feature-test macros must be defined before any system header */
#define _XOPEN_SOURCE 600
#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include "dSFMT.h"

int main(int argc, char* argv[]) {
    int i, j, cnt, seed;
    double x, y, pi;
    const int NUM = 10000;
    const int R_SIZE = 2 * NUM;
    int size;
    double *array;
    dsfmt_t dsfmt;

    if (argc &gt;= 2) {
        seed = strtol(argv[1], NULL, 10);
    } else {
        seed = 12345;
    }
    /* the array must hold at least the minimum block size */
    size = dsfmt_get_min_array_size();
    if (size &lt; R_SIZE) {
        size = R_SIZE;
    }
#if defined(__APPLE__) || \
    (defined(__FreeBSD__) &amp;&amp; __FreeBSD__ &gt;= 3 &amp;&amp; __FreeBSD__ &lt;= 6)
    /* posix_memalign may be unavailable here; fall back to malloc */
    printf("malloc used\n");
    array = malloc(sizeof(double) * size);
    if (array == NULL) {
        printf("can't allocate memory.\n");
        return 1;
    }
#elif defined(_POSIX_C_SOURCE)
    printf("posix_memalign used\n");
    if (posix_memalign((void **)&amp;array, 16, sizeof(double) * size) != 0) {
        printf("can't allocate memory.\n");
        return 1;
    }
#elif defined(__GNUC__) &amp;&amp; (__GNUC__ &gt; 3 || (__GNUC__ == 3 &amp;&amp; __GNUC_MINOR__ &gt;= 3))
    printf("memalign used\n");
    array = memalign(16, sizeof(double) * size);
    if (array == NULL) {
        printf("can't allocate memory.\n");
        return 1;
    }
#else /* in this case, gcc doesn't support SSE2 */
    array = malloc(sizeof(double) * size);
    if (array == NULL) {
        printf("can't allocate memory.\n");
        return 1;
    }
#endif
    cnt = 0;
    j = 0;
    dsfmt_init_gen_rand(&amp;dsfmt, seed);
    /* generate all random numbers with one block call */
    dsfmt_fill_array_close_open(&amp;dsfmt, array, size);
    for (i = 0; i &lt; NUM; i++) {
        x = array[j++];
        y = array[j++];
        if (x * x + y * y &lt; 1.0) {
            cnt++;
        }
    }
    free(array);
    pi = (double)cnt / NUM * 4;
    printf("%f\n", pi);
    return 0;
}
      </pre>
    </blockquote>
    <p>To compile <strong>sample2.c</strong> with dSFMT.c using the
      Mersenne exponent 2203 (a period of at least 2<sup>2203</sup>-1), type</p>
    <blockquote>
      <pre>gcc -DDSFMT_MEXP=2203 -o sample2 dSFMT.c sample2.c</pre>
    </blockquote>
    <p>If your CPU supports SSE2 and you want to use the version of dSFMT
      optimized for SSE2, type</p>
    <blockquote>
      <pre>gcc -msse2 -DDSFMT_MEXP=2203 -DHAVE_SSE2 -o sample2 dSFMT.c sample2.c</pre>
    </blockquote>
    <p>If your computer is an Apple PowerPC G4 or G5 and you want to use
      the version of dSFMT optimized for AltiVec, type</p>
    <blockquote>
      <pre>gcc -faltivec -DDSFMT_MEXP=2203 -DHAVE_ALTIVEC -o sample2 dSFMT.c sample2.c</pre>
    </blockquote>
    <h4>2-3. Initialize dSFMT using dsfmt_init_by_array function.</h4>
    <p>
      Here is <strong>sample3.c</strong>, another modified version of sample1.c.
      A 32-bit integer seed can select only 2<sup>32</sup> different
      initial states. To avoid this limitation, dSFMT
      provides the <strong>dsfmt_init_by_array</strong> function, which
      initializes the internal state array from an array of 32-bit
      integers. The seed array may be larger than the internal state
      array, and all of its elements are used for initialization, but
      a very large array is wasteful.
    </p>
    <blockquote>
      <pre>
#include &lt;stdio.h&gt;
#include &lt;string.h&gt;
#include "dSFMT.h"

int main(int argc, char* argv[]) {
    int i, cnt, seed_cnt;
    double x, y, pi;
    const int NUM = 10000;
    uint32_t seeds[100];
    dsfmt_t dsfmt;

    if (argc &gt;= 2) {
        /* use each character of the argument as one seed element */
        seed_cnt = 0;
        for (i = 0; (i &lt; 100) &amp;&amp; (i &lt; strlen(argv[1])); i++) {
            seeds[i] = argv[1][i];
            seed_cnt++;
        }
    } else {
        seeds[0] = 12345;
        seed_cnt = 1;
    }
    cnt = 0;
    dsfmt_init_by_array(&amp;dsfmt, seeds, seed_cnt);
    for (i = 0; i &lt; NUM; i++) {
        x = dsfmt_genrand_close_open(&amp;dsfmt);
        y = dsfmt_genrand_close_open(&amp;dsfmt);
        if (x * x + y * y &lt; 1.0) {
            cnt++;
        }
    }
    pi = (double)cnt / NUM * 4;
    printf("%f\n", pi);
    return 0;
}
      </pre>
    </blockquote>
    <p>To compile <strong>sample3.c</strong>, type</p>
    <blockquote>
      <pre>gcc -DDSFMT_MEXP=1279 -o sample3 dSFMT.c sample3.c</pre>
    </blockquote>
    <p>Now the seed can be a string, like this:</p>
    <blockquote>
      <pre>./sample3 your-full-name</pre>
    </blockquote>
  </body>
</html>