| 1 | <?xml version="1.0" encoding="UTF-8" ?>
|
---|
| 2 | <!DOCTYPE html
|
---|
| 3 | PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
|
---|
| 4 | "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
|
---|
| 5 | <html xmlns="http://www.w3.org/1999/xhtml">
|
---|
| 6 | <head>
|
---|
| 7 | <meta http-equiv="Content-Type" content="text/html" />
|
---|
| 8 | <title>How to compile dSFMT</title>
|
---|
| 9 | <style type="text/css">
|
---|
| 10 | blockquote {background-color:#a0ffa0;
|
---|
| 11 | padding-left: 1em;}
|
---|
| 12 | </style>
|
---|
| 13 | </head>
|
---|
| 14 | <body>
|
---|
| 15 | <h2> How to compile dSFMT</h2>
|
---|
| 16 |
|
---|
| 17 | <p>
|
---|
| 18 | This document explains how to compile dSFMT from a terminal on
|
---|
| 19 | UNIX-like systems (for example Linux, FreeBSD,
|
---|
| 20 | Cygwin, or OS X). It does not cover IDEs
|
---|
| 21 | (Integrated Development Environments); please see your IDE's help
|
---|
| 22 | for how to enable the SIMD features of your CPU.
|
---|
| 23 | </p>
|
---|
| 24 |
|
---|
| 25 | <h3>1. First Step: Compile test programs using Makefile.</h3>
|
---|
| 26 | <h4>1-1. Compile standard C test program.</h4>
|
---|
| 27 | <p>
|
---|
| 28 | Check that dSFMT.c and Makefile are in your current directory.
|
---|
| 29 | If not, <strong>cd</strong> to the directory that contains them.
|
---|
| 30 | Then type
|
---|
| 31 | </p>
|
---|
| 32 | <blockquote>
|
---|
| 33 | <pre>make std</pre>
|
---|
| 34 | </blockquote>
|
---|
| 35 | <p>
|
---|
| 36 | If that fails, try typing
|
---|
| 37 | </p>
|
---|
| 38 | <blockquote>
|
---|
| 39 | <pre>cc -DDSFMT_MEXP=19937 -o test-std-M19937 dSFMT.c test.c</pre>
|
---|
| 40 | </blockquote>
|
---|
| 41 | <p>
|
---|
| 42 | or
|
---|
| 43 | </p>
|
---|
| 44 | <blockquote>
|
---|
| 45 | <pre>gcc -DDSFMT_MEXP=19937 -o test-std-M19937 dSFMT.c test.c</pre>
|
---|
| 46 | </blockquote>
|
---|
| 47 | <p>
|
---|
| 48 | If compilation succeeds, run the test program. Type
|
---|
| 49 | </p>
|
---|
| 50 | <blockquote>
|
---|
| 51 | <pre>./test-std-M19937 -v</pre>
|
---|
| 52 | </blockquote>
|
---|
| 53 | <p>
|
---|
| 54 | You will see many random numbers displayed on your screen.
|
---|
| 55 | To check that these random numbers are the correct output,
|
---|
| 56 | redirect the output to a file and <strong>diff</strong> it with
|
---|
| 57 | <strong>dSFMT.19937.out.txt</strong>, like this:</p>
|
---|
| 58 | <blockquote>
|
---|
| 59 | <pre>./test-std-M19937 -v > foo.txt
|
---|
| 60 | diff -w foo.txt dSFMT.19937.out.txt</pre>
|
---|
| 61 | </blockquote>
|
---|
| 62 | <p>
|
---|
| 63 | No output means the two files match, because <strong>diff</strong>
|
---|
| 64 | prints only the differences between them.
|
---|
| 65 | </p>
|
---|
| 66 | <p>
|
---|
| 67 | If you want to know the generation speed of dSFMT, type
|
---|
| 68 | </p>
|
---|
| 69 | <blockquote>
|
---|
| 70 | <pre>./test-std-M19937 -s</pre>
|
---|
| 71 | </blockquote>
|
---|
| 72 | <p>
|
---|
| 73 | Without optimization it is very slow. To make it fast, compile
|
---|
| 74 | with the <strong>-O3</strong> option. If your compiler is gcc, you
|
---|
| 75 | should also specify the <strong>-fno-strict-aliasing</strong> option
|
---|
| 76 | together with <strong>-O3</strong>. Type
|
---|
| 77 | </p>
|
---|
| 78 | <blockquote>
|
---|
| 79 | <pre>gcc -O3 -fno-strict-aliasing -DDSFMT_MEXP=19937 -o test-std-M19937 dSFMT.c test.c
|
---|
| 80 | ./test-std-M19937 -s</pre>
|
---|
| 81 | </blockquote>
|
---|
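The -s option prints timings measured by the test program itself. To time a generator inside your own code, a minimal sketch using the standard clock() function might look like the following. The names next_fn, lcg_next and time_calls are hypothetical and are not part of dSFMT, and the small LCG is only a stand-in so that the sketch compiles on its own; no claim is made that test.c measures speed this way.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: returns the next random double. */
typedef double (*next_fn)(void *state);

/* Stand-in generator (a tiny LCG), present only to keep the sketch
   self-contained. Replace it with a wrapper around your real generator. */
static double lcg_next(void *state) {
    unsigned long *s = (unsigned long *)state;
    *s = (*s * 1103515245UL + 12345UL) & 0x7fffffffUL;
    return (double)*s / 2147483648.0;   /* value in [0, 1) */
}

/* Call fn n times and return the elapsed CPU time in seconds.
   The sum is stored in *sink so the compiler cannot drop the loop. */
static double time_calls(next_fn fn, void *state, size_t n, double *sink) {
    clock_t start, end;
    double sum = 0.0;
    size_t i;

    assert(fn != NULL && sink != NULL);
    start = clock();
    for (i = 0; i < n; i++) {
        sum += fn(state);
    }
    end = clock();
    *sink = sum;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Build the timing code with the same optimization options as the generator; otherwise the loop overhead can dominate the measurement.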
| 82 | <p>
|
---|
| 83 | If you are using gcc 4.0, you can get better performance from dSFMT
|
---|
| 84 | by adding the options
|
---|
| 85 | <strong>--param max-inline-insns-single=1800</strong>,
|
---|
| 86 | <strong>--param inline-unit-growth=500</strong> and
|
---|
| 87 | <strong>--param large-function-growth=900</strong>.
|
---|
| 88 | </p>
|
---|
| 89 |
|
---|
| 90 | <h4>1-2. Compile SSE2 test program.</h4>
|
---|
| 91 | <p>
|
---|
| 92 | If your CPU supports SSE2 and you can use gcc version 3.4 or later,
|
---|
| 93 | you can make test-sse2-M19937. To do this, type
|
---|
| 94 | </p>
|
---|
| 95 | <blockquote>
|
---|
| 96 | <pre>make sse2</pre>
|
---|
| 97 | </blockquote>
|
---|
| 98 | <p>or type</p>
|
---|
| 99 | <blockquote>
|
---|
| 100 | <pre>gcc -O3 -msse2 -fno-strict-aliasing -DHAVE_SSE2=1 -DDSFMT_MEXP=19937 -o test-sse2-M19937 dSFMT.c test.c</pre>
|
---|
| 101 | </blockquote>
|
---|
| 102 | <p>If everything works well,</p>
|
---|
| 103 | <blockquote>
|
---|
| 104 | <pre>./test-sse2-M19937 -s</pre>
|
---|
| 105 | </blockquote>
|
---|
| 106 | <p>should report a much shorter time than <strong>test-std-M19937 -s</strong>.</p>
|
---|
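When an SSE2 or AltiVec build is not faster, it can help to confirm that the compiler actually targeted that instruction set. The sketch below relies on the macros __SSE2__ and __ALTIVEC__, which gcc predefines when -msse2 or -maltivec is in effect (Apple's -faltivec is assumed here to behave the same way). The function names are hypothetical and are not part of dSFMT. Note that these macros describe the compiler flags, not the CPU the program eventually runs on.

```c
#include <stdio.h>

/* Report which SIMD instruction set the compiler was told to target.
   Only the compile-time flags are inspected, not the running CPU. */
static const char *simd_target_name(void) {
#if defined(__SSE2__)
    return "SSE2";
#elif defined(__ALTIVEC__)
    return "AltiVec";
#else
    return "none";
#endif
}

/* Print the result, for example at program start-up. */
static void print_simd_target(void) {
    printf("compiled for SIMD target: %s\n", simd_target_name());
}
```

On x86-64 targets gcc enables SSE2 by default, so the first branch is normally taken there even without -msse2.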
| 107 |
|
---|
| 108 | <h4>1-3. Compile AltiVec test program.</h4>
|
---|
| 109 | <p>
|
---|
| 110 | If you are using a Macintosh computer with a PowerPC G4 or G5, and
|
---|
| 111 | your gcc version is 3.3 or later, you can make test-alti-M19937. To
|
---|
| 112 | do this, type
|
---|
| 113 | </p>
|
---|
| 114 | <blockquote>
|
---|
| 115 | <pre>make osx-alti</pre>
|
---|
| 116 | </blockquote>
|
---|
| 117 | <p>or type</p>
|
---|
| 118 | <blockquote>
|
---|
| 119 | <pre>gcc -O3 -faltivec -fno-strict-aliasing -DHAVE_ALTIVEC=1 -DDSFMT_MEXP=19937 -o test-alti-M19937 dSFMT.c test.c</pre>
|
---|
| 120 | </blockquote>
|
---|
| 121 | <p>If everything works well,</p>
|
---|
| 122 | <blockquote>
|
---|
| 123 | <pre>./test-alti-M19937 -s</pre>
|
---|
| 124 | </blockquote>
|
---|
| 125 | <p>should report a much shorter time than <strong>test-std-M19937 -s</strong>.</p>
|
---|
| 126 |
|
---|
| 127 | <h4>1-4. Compile and check output automatically.</h4>
|
---|
| 128 | <p>
|
---|
| 129 | To build the test programs and check their output
|
---|
| 130 | automatically for all supported DSFMT_MEXP values, type
|
---|
| 131 | </p>
|
---|
| 132 | <blockquote>
|
---|
| 133 | <pre>make std-check</pre>
|
---|
| 134 | </blockquote>
|
---|
| 135 | <p>
|
---|
| 136 | To check the test program optimized for SSE2, type
|
---|
| 137 | </p>
|
---|
| 138 | <blockquote>
|
---|
| 139 | <pre>make sse2-check</pre>
|
---|
| 140 | </blockquote>
|
---|
| 141 | <p>
|
---|
| 142 | To check the test program optimized for AltiVec on OS X PowerPC, type
|
---|
| 143 | </p>
|
---|
| 144 | <blockquote>
|
---|
| 145 | <pre>make osx-alti-check</pre>
|
---|
| 146 | </blockquote>
|
---|
| 147 | <p>
|
---|
| 148 | These commands may take some time.
|
---|
| 149 | </p>
|
---|
| 150 |
|
---|
| 151 | <h3>2. Second Step: Use the dSFMT pseudorandom number generator in
|
---|
| 152 | your C program.</h3>
|
---|
| 153 | <h4>2-1. Use sequential call and static link.</h4>
|
---|
| 154 | <p>
|
---|
| 155 | Here is a very simple program, <strong>sample1.c</strong>, which
|
---|
| 156 | estimates pi using the Monte Carlo method.
|
---|
| 157 | </p>
|
---|
| 158 | <blockquote>
|
---|
| 159 | <pre>
|
---|
| 160 | #include <stdio.h>
|
---|
| 161 | #include <stdlib.h>
|
---|
| 162 | #include "dSFMT.h"
|
---|
| 163 |
|
---|
| 164 | int main(int argc, char* argv[]) {
|
---|
| 165 | int i, cnt, seed;
|
---|
| 166 | double x, y, pi;
|
---|
| 167 | const int NUM = 10000;
|
---|
| 168 | dsfmt_t dsfmt;
|
---|
| 169 |
|
---|
| 170 | if (argc >= 2) {
|
---|
| 171 | seed = strtol(argv[1], NULL, 10);
|
---|
| 172 | } else {
|
---|
| 173 | seed = 12345;
|
---|
| 174 | }
|
---|
| 175 | cnt = 0;
|
---|
| 176 | dsfmt_init_gen_rand(&dsfmt, seed);
|
---|
| 177 | for (i = 0; i < NUM; i++) {
|
---|
| 178 | x = dsfmt_genrand_close_open(&dsfmt);
|
---|
| 179 | y = dsfmt_genrand_close_open(&dsfmt);
|
---|
| 180 | if (x * x + y * y < 1.0) {
|
---|
| 181 | cnt++;
|
---|
| 182 | }
|
---|
| 183 | }
|
---|
| 184 | pi = (double)cnt / NUM * 4;
|
---|
| 185 | printf("%f\n", pi);
|
---|
| 186 | return 0;
|
---|
| 187 | }
|
---|
| 188 | </pre>
|
---|
| 189 | </blockquote>
|
---|
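The estimate works because the fraction of uniformly random points in the unit square that fall inside the quarter circle approaches pi/4. To check that logic without a random source, the counting loop can be factored into a function that takes the generator as a callback. This is only a sketch: uniform_fn, estimate_pi, table_gen and table_next are hypothetical names, not part of dSFMT.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: returns the next value in [0, 1). */
typedef double (*uniform_fn)(void *state);

/* Estimate pi from n points in the unit square: the fraction that
   lands inside the quarter circle approximates pi/4. */
static double estimate_pi(uniform_fn next, void *state, size_t n) {
    size_t i, cnt = 0;

    assert(next != NULL && n > 0);
    for (i = 0; i < n; i++) {
        double x = next(state);
        double y = next(state);
        if (x * x + y * y < 1.0) {
            cnt++;
        }
    }
    return (double)cnt / (double)n * 4.0;
}

/* Deterministic stand-in generator for checking the logic:
   cycles through a fixed table of values. Not a random source. */
struct table_gen {
    const double *values;
    size_t len;
    size_t pos;
};

static double table_next(void *state) {
    struct table_gen *g = (struct table_gen *)state;
    double v = g->values[g->pos];
    g->pos = (g->pos + 1) % g->len;
    return v;
}
```

With dSFMT, the callback would be a small wrapper that casts the state pointer to dsfmt_t * and returns dsfmt_genrand_close_open for it.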
| 190 | <p>To compile <strong>sample1.c</strong> with dSFMT.c with a period that is a multiple of
|
---|
| 191 | 2<sup>521</sup>-1, type</p>
|
---|
| 192 | <blockquote>
|
---|
| 193 | <pre>gcc -DDSFMT_MEXP=521 -o sample1 dSFMT.c sample1.c</pre>
|
---|
| 194 | </blockquote>
|
---|
| 195 | <p>If your CPU supports SSE2 and you want to use optimized dSFMT for
|
---|
| 196 | SSE2, type</p>
|
---|
| 197 | <blockquote>
|
---|
| 198 | <pre>gcc -msse2 -DDSFMT_MEXP=521 -DHAVE_SSE2 -o sample1 dSFMT.c sample1.c</pre>
|
---|
| 199 | </blockquote>
|
---|
| 200 | <p>If your computer is an Apple PowerPC G4 or G5 and you want to use
|
---|
| 201 | optimized dSFMT for AltiVec, type</p>
|
---|
| 202 | <blockquote>
|
---|
| 203 | <pre>gcc -faltivec -DDSFMT_MEXP=521 -DHAVE_ALTIVEC -o sample1 dSFMT.c sample1.c</pre>
|
---|
| 204 | </blockquote>
|
---|
| 205 |
|
---|
| 206 | <h4>2-2. Use block call and static link.</h4>
|
---|
| 207 | <p>
|
---|
| 208 | Here is <strong>sample2.c</strong>, a modified version of sample1.c.
|
---|
| 209 | The block call <strong>dsfmt_fill_array_close_open</strong> is
|
---|
| 210 | much faster than sequential calls, but it needs aligned
|
---|
| 211 | memory. The standard function for allocating aligned memory
|
---|
| 212 | is <strong>posix_memalign</strong>, but it is not available on every
|
---|
| 213 | OS.
|
---|
| 214 | </p>
|
---|
| 215 | <blockquote>
|
---|
| 216 | <pre>
|
---|
| 217 | #define _XOPEN_SOURCE 600
|
---|
| 218 | #include <stdio.h>
|
---|
| 219 | #include <stdlib.h>
|
---|
| 220 | #include "dSFMT.h"
|
---|
| 221 |
|
---|
| 222 | int main(int argc, char* argv[]) {
|
---|
| 223 | int i, j, cnt, seed;
|
---|
| 224 | double x, y, pi;
|
---|
| 225 | const int NUM = 10000;
|
---|
| 226 | const int R_SIZE = 2 * NUM;
|
---|
| 227 | int size;
|
---|
| 228 | double *array;
|
---|
| 229 | dsfmt_t dsfmt;
|
---|
| 230 |
|
---|
| 231 | if (argc >= 2) {
|
---|
| 232 | seed = strtol(argv[1], NULL, 10);
|
---|
| 233 | } else {
|
---|
| 234 | seed = 12345;
|
---|
| 235 | }
|
---|
| 236 | size = dsfmt_get_min_array_size();
|
---|
| 237 | if (size < R_SIZE) {
|
---|
| 238 | size = R_SIZE;
|
---|
| 239 | }
|
---|
| 240 | #if defined(__APPLE__) || \
|
---|
| 241 | (defined(__FreeBSD__) && __FreeBSD__ >= 3 && __FreeBSD__ <= 6)
|
---|
| 242 | printf("malloc used\n");
|
---|
| 243 | array = malloc(sizeof(double) * size);
|
---|
| 244 | if (array == NULL) {
|
---|
| 245 | printf("can't allocate memory.\n");
|
---|
| 246 | return 1;
|
---|
| 247 | }
|
---|
| 248 | #elif defined(_POSIX_C_SOURCE)
|
---|
| 249 | printf("posix_memalign used\n");
|
---|
| 250 | if (posix_memalign((void **)&array, 16, sizeof(double) * size) != 0) {
|
---|
| 251 | printf("can't allocate memory.\n");
|
---|
| 252 | return 1;
|
---|
| 253 | }
|
---|
| 254 | #elif defined(__GNUC__) && (__GNUC__ > 3 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 3))
|
---|
| 255 | printf("memalign used\n");
|
---|
| 256 | array = memalign(16, sizeof(double) * size);
|
---|
| 257 | if (array == NULL) {
|
---|
| 258 | printf("can't allocate memory.\n");
|
---|
| 259 | return 1;
|
---|
| 260 | }
|
---|
| 261 | #else /* in this case, gcc doesn't support SSE2 */
|
---|
| 262 | array = malloc(sizeof(double) * size);
|
---|
| 263 | if (array == NULL) {
|
---|
| 264 | printf("can't allocate memory.\n");
|
---|
| 265 | return 1;
|
---|
| 266 | }
|
---|
| 267 | #endif
|
---|
| 268 | cnt = 0;
|
---|
| 269 | j = 0;
|
---|
| 270 | dsfmt_init_gen_rand(&dsfmt, seed);
|
---|
| 271 | dsfmt_fill_array_close_open(&dsfmt, array, size);
|
---|
| 272 | for (i = 0; i < NUM; i++) {
|
---|
| 273 | x = array[j++];
|
---|
| 274 | y = array[j++];
|
---|
| 275 | if (x * x + y * y < 1.0) {
|
---|
| 276 | cnt++;
|
---|
| 277 | }
|
---|
| 278 | }
|
---|
| 279 | free(array);
|
---|
| 280 | pi = (double)cnt / NUM * 4;
|
---|
| 281 | printf("%f\n", pi);
|
---|
| 282 | return 0;
|
---|
| 283 | }
|
---|
| 284 | </pre>
|
---|
| 285 | </blockquote>
|
---|
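Since C11, the standard library also offers aligned_alloc, which can replace the platform-specific branches above where a C11 library is available. The following is a minimal sketch under that assumption; alloc_aligned_doubles and is_aligned16 are hypothetical helper names, not part of dSFMT.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate room for n doubles on a 16-byte boundary using C11
   aligned_alloc. The byte count is rounded up to a multiple of 16
   because aligned_alloc expects the size to be a multiple of the
   alignment. Returns NULL on failure; release the memory with free(). */
static double *alloc_aligned_doubles(size_t n) {
    size_t bytes;

    assert(n > 0);
    if (n > (SIZE_MAX - 15) / sizeof(double)) {
        return NULL;                    /* size computation would overflow */
    }
    bytes = (n * sizeof(double) + 15) & ~(size_t)15;
    return (double *)aligned_alloc(16, bytes);
}

/* Check whether a pointer is 16-byte aligned. */
static int is_aligned16(const void *p) {
    return ((uintptr_t)p % 16) == 0;
}
```

In sample2.c the call would be array = alloc_aligned_doubles(size), followed by the same NULL check as in the other branches.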
| 286 | <p>To compile <strong>sample2.c</strong> with dSFMT.c with a period that is a multiple of
|
---|
| 287 | 2<sup>2203</sup>-1, type</p>
|
---|
| 288 | <blockquote>
|
---|
| 289 | <pre>gcc -DDSFMT_MEXP=2203 -o sample2 dSFMT.c sample2.c</pre>
|
---|
| 290 | </blockquote>
|
---|
| 291 | <p>If your CPU supports SSE2 and you want to use optimized dSFMT for
|
---|
| 292 | SSE2, type</p>
|
---|
| 293 | <blockquote>
|
---|
| 294 | <pre>gcc -msse2 -DDSFMT_MEXP=2203 -DHAVE_SSE2 -o sample2 dSFMT.c sample2.c</pre>
|
---|
| 295 | </blockquote>
|
---|
| 296 | <p>If your computer is Apple PowerPC G4 or G5 and you want to use
|
---|
| 297 | optimized dSFMT for AltiVec, type</p>
|
---|
| 298 | <blockquote>
|
---|
| 299 | <pre>gcc -faltivec -DDSFMT_MEXP=2203 -DHAVE_ALTIVEC -o sample2 dSFMT.c sample2.c</pre>
|
---|
| 300 | </blockquote>
|
---|
| 301 | <h4>2-3. Initialize dSFMT using dsfmt_init_by_array function.</h4>
|
---|
| 302 | <p>
|
---|
| 303 | Here is <strong>sample3.c</strong>, a modified version of sample1.c.
|
---|
| 304 | A 32-bit integer seed can produce only 2<sup>32</sup> different
|
---|
| 305 | initial states. To avoid this limitation, dSFMT
|
---|
| 306 | provides the <strong>dsfmt_init_by_array</strong> function. This sample
|
---|
| 307 | uses dsfmt_init_by_array to initialize the internal state
|
---|
| 308 | array from an array of 32-bit integers. The seed array can be
|
---|
| 309 | larger than the internal state array, and all elements of the
|
---|
| 310 | array are used for initialization, but a very large array is
|
---|
| 311 | wasteful.
|
---|
| 312 | </p>
|
---|
| 313 | <blockquote>
|
---|
| 314 | <pre>
|
---|
| 315 | #include <stdio.h>
|
---|
| 316 | #include <string.h>
|
---|
| 317 | #include "dSFMT.h"
|
---|
| 318 |
|
---|
| 319 | int main(int argc, char* argv[]) {
|
---|
| 320 | int i, cnt, seed_cnt;
|
---|
| 321 | double x, y, pi;
|
---|
| 322 | const int NUM = 10000;
|
---|
| 323 | uint32_t seeds[100];
|
---|
| 324 | dsfmt_t dsfmt;
|
---|
| 325 |
|
---|
| 326 | if (argc >= 2) {
|
---|
| 327 | seed_cnt = 0;
|
---|
| 328 | for (i = 0; (i < 100) && (i < strlen(argv[1])); i++) {
|
---|
| 329 | seeds[i] = argv[1][i];
|
---|
| 330 | seed_cnt++;
|
---|
| 331 | }
|
---|
| 332 | } else {
|
---|
| 333 | seeds[0] = 12345;
|
---|
| 334 | seed_cnt = 1;
|
---|
| 335 | }
|
---|
| 336 | cnt = 0;
|
---|
| 337 | dsfmt_init_by_array(&dsfmt, seeds, seed_cnt);
|
---|
| 338 | for (i = 0; i < NUM; i++) {
|
---|
| 339 | x = dsfmt_genrand_close_open(&dsfmt);
|
---|
| 340 | y = dsfmt_genrand_close_open(&dsfmt);
|
---|
| 341 | if (x * x + y * y < 1.0) {
|
---|
| 342 | cnt++;
|
---|
| 343 | }
|
---|
| 344 | }
|
---|
| 345 | pi = (double)cnt / NUM * 4;
|
---|
| 346 | printf("%f\n", pi);
|
---|
| 347 | return 0;
|
---|
| 348 | }
|
---|
| 349 | </pre>
|
---|
| 350 | </blockquote>
|
---|
| 351 | <p>To compile <strong>sample3.c</strong>, type</p>
|
---|
| 352 | <blockquote>
|
---|
| 353 | <pre>gcc -DDSFMT_MEXP=1279 -o sample3 dSFMT.c sample3.c</pre>
|
---|
| 354 | </blockquote>
|
---|
| 355 | <p>Now the seed can be a string, like this:</p>
|
---|
| 356 | <blockquote>
|
---|
| 357 | <pre>./sample3 your-full-name</pre>
|
---|
| 358 | </blockquote>
|
---|
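sample3.c stores one character in each 32-bit seed element, so most bits of every element are zero. One possible alternative, shown here only as a sketch, packs four bytes of the string into each element, which lets a 100-element array hold up to 400 characters. The helper pack_seed is hypothetical and is not part of dSFMT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Pack the bytes of a string into 32-bit words, four bytes per word,
   lowest byte first. Writes at most max_words words to out and
   returns the number of words written (0 for an empty string). */
static int pack_seed(const char *s, uint32_t *out, int max_words) {
    size_t len, i;
    int words = 0;

    assert(s != NULL && out != NULL && max_words > 0);
    len = strlen(s);
    for (i = 0; i < len && (i / 4) < (size_t)max_words; i++) {
        if (i % 4 == 0) {
            out[i / 4] = 0;             /* start a new word */
            words++;
        }
        out[i / 4] |= (uint32_t)(unsigned char)s[i] << (8 * (i % 4));
    }
    return words;
}
```

In sample3.c this would replace the character-copy loop with seed_cnt = pack_seed(argv[1], seeds, 100). An empty argument yields a count of 0, so the caller should fall back to a default seed in that case.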
| 359 | </body>
|
---|
| 360 | </html>
|
---|