source: Daodan/MSYS2/mingw32/share/doc/mpfr/TODO@ 1167

Last change on this file since 1167 was 1166, checked in by rossy, 3 years ago

Daodan: Replace MinGW build env with an up-to-date MSYS2 env

File size: 35.0 KB
Line 
1Copyright 1999-2020 Free Software Foundation, Inc.
2Contributed by the AriC and Caramba projects, INRIA.
3
4This file is part of the GNU MPFR Library.
5
6The GNU MPFR Library is free software; you can redistribute it and/or modify
7it under the terms of the GNU Lesser General Public License as published by
8the Free Software Foundation; either version 3 of the License, or (at your
9option) any later version.
10
11The GNU MPFR Library is distributed in the hope that it will be useful, but
12WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
13or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public
14License for more details.
15
16You should have received a copy of the GNU Lesser General Public License
17along with the GNU MPFR Library; see the file COPYING.LESSER. If not, see
18https://www.gnu.org/licenses/ or write to the Free Software Foundation, Inc.,
1951 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
20
21Table of contents:
221. Documentation
232. Compiler/library detection
243. Changes in existing functions
254. New functions to implement
265. Efficiency
276. Miscellaneous
287. Portability
29
30##############################################################################
311. Documentation
32##############################################################################
33
34- add a description of the algorithms used and a proof of correctness
35
36##############################################################################
372. Compiler/library detection
38##############################################################################
39
40- update ICC detection.
41 * Use only __INTEL_COMPILER instead of the obsolete macro __ICC?
42
43##############################################################################
443. Changes in existing functions
45##############################################################################
46
47- export mpfr_overflow and mpfr_underflow as public functions
48
49- many functions currently taking into account the precision of the *input*
50 variable to set the initial working precision (acosh, asinh, cosh, ...).
51 This is nonsense since the "average" working precision should only depend
52 on the precision of the *output* variable (and maybe on the *value* of
53 the input in case of cancellation).
54 -> remove those dependencies from the input precision.
55
56- mpfr_can_round:
57 change the meaning of the 2nd argument (err). Currently the error is
58 at most 2^(MPFR_EXP(b)-err), i.e. err is the relative shift wrt the
59 most significant bit of the approximation. I propose that the error
60 is now at most 2^err ulps of the approximation, i.e.
61 2^(MPFR_EXP(b)-MPFR_PREC(b)+err).
62
63- mpfr_set_q first tries to convert the numerator and the denominator
64 to mpfr_t. But this conversion may fail even if the correctly rounded
65 result is representable. New way to implement:
66 Function q = a/b. nq = PREC(q) na = PREC(a) nb = PREC(b)
67 If na < nb
68 a <- a*2^(nb-na)
69 n <- na-nb+ (HIGH(a,nb) >= b)
70 if (n >= nq)
71 bb <- b*2^(n-nq)
72 a = q*bb+r --> q has exactly n bits.
73 else
74 aa <- a*2^(nq-n)
75 aa = q*b+r --> q has exactly n bits.
76 If RNDN, takes nq+1 bits. (See also the new division function).
77
78- revisit the conversion functions between a MPFR number and a native
79 floating-point value.
80 * Consequences if some exception is trapped?
81 * Specify under which conditions (current rounding direction and
82 precision of the FPU, whether a format has been recognized...),
83 correct rounding is guaranteed. Fix the code if need be. Do not
84 forget subnormals.
85 * Provide mpfr_buildopt_* functions to tell whether the format of a
86 native type (float / double / long double) has been recognized and
87 which format it is?
88 * For functions that return a native floating-point value (mpfr_get_flt,
89 mpfr_get_d, mpfr_get_ld, mpfr_get_decimal64), in case of underflow or
90 overflow, follow the convention used for the functions in <math.h>?
91 See §7.12.1 "Treatment of error conditions" of ISO C11, which provides
92 two ways of handling error conditions, depending on math_errhandling:
93 errno (to be set to ERANGE here) and floating-point exceptions.
94 If floating-point exceptions need to be generated, do not use
95 feraiseexcept(), as this function may require the math library (-lm);
96 use a floating-point expression instead, such as DBL_MIN * DBL_MIN
97 (underflow) or DBL_MAX * DBL_MAX (overflow), which are probably safe
98 as used in the GNU libc implementation.
99 * For testing the lack of subnormal support:
100 see the -mfpu GCC option for ARM and
101 https://en.wikipedia.org/wiki/Denormal_number#Disabling_denormal_floats_at_the_code_level
102
103
104##############################################################################
1054. New functions to implement
106##############################################################################
107
108- a function to compute the hash of a floating-point number
109 (suggested by Patrick Pelissier)
110- implement new functions from the C++17 standard:
111 http://en.cppreference.com/w/cpp/numeric/special_math
112 assoc_laguerre, assoc_legendre, comp_ellint_1, comp_ellint_2, comp_ellint_3,
113 cyl_bessel_i, cyl_bessel_j, cyl_bessel_k, cyl_neumann, ellint_1, ellint_2,
114 ellint_3, hermite, legendre, laguerre, sph_bessel, sph_legendre,
115 sph_neumann.
116 Already in mpfr4: beta and riemann_zeta.
117 See also https://isocpp.org/files/papers/P0226R1.pdf and §29.9.5 in the
118 C++17 draft:
119 https://github.com/cplusplus/draft/blob/master/source/numerics.tex
120- implement mpfr_q_sub, mpfr_z_div, mpfr_q_div?
121- implement mpfr_pow_q and variants with two integers (native or mpz)
122 instead of a rational? See IEEE P1788.
123- implement functions for random distributions, see for example
124 https://sympa.inria.fr/sympa/arc/mpfr/2010-01/msg00034.html
125 (suggested by Charles Karney <ckarney@Sarnoff.com>, 18 Jan 2010):
126 * a Bernoulli distribution with prob p/q (exact)
127 * a general discrete distribution (i with prob w[i]/sum(w[i]) (Walker
128 algorithm, but make it exact)
129 * a uniform distribution in (a,b)
130 * exponential distribution (mean lambda) (von Neumann's method?)
131 * normal distribution (mean m, s.d. sigma) (ratio method?)
132- wanted for Magma [John Cannon <john@maths.usyd.edu.au>, Tue, 19 Apr 2005]:
133 HypergeometricU(a,b,s) = 1/gamma(a)*int(exp(-su)*u^(a-1)*(1+u)^(b-a-1),
134 u=0..infinity)
135 JacobiThetaNullK
136 PolylogP, PolylogD, PolylogDold: see http://arxiv.org/abs/math.CA/0702243
137 and the references herein.
138 JBessel(n, x) = BesselJ(n+1/2, x)
139 KBessel, KBessel2 [2nd kind]
140 JacobiTheta
141 (see http://www.ams.org/journals/mcom/0000-000-00/S0025-5718-2017-03245-2/home.html)
142 LogIntegral
143 ExponentialIntegralEn (formula 5.1.4 of Abramowitz and Stegun)
144 DawsonIntegral
145 GammaD(x) = Gamma(x+1/2)
146- new functions of IEEE 754-2008, and more generally functions of the
147 C binding draft TS 18661-4:
148 http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1946.pdf
149 Some propositions about rootn: mpfr_rootn_si, mpfr_rootn_sj, mpfr_rootn_z,
150 and versions with an unsigned integer: mpfr_rootn_ui (now implemented, as
151 similar to mpfr_root) and mpfr_rootn_uj.
152- functions defined in the LIA-2 standard
153 + minimum and maximum (5.2.2): max, min, max_seq, min_seq, mmax_seq
154 and mmin_seq (mpfr_min and mpfr_max correspond to mmin and mmax);
155 + rounding_rest, floor_rest, ceiling_rest (5.2.4);
156 + remr (5.2.5): x - round(x/y) y;
157 + error functions from 5.2.7 (if useful in MPFR);
158 + power1pm1 (5.3.6.7): (1 + x)^y - 1;
159 + logbase (5.3.6.12): \log_x(y);
160 + logbase1p1p (5.3.6.13): \log_{1+x}(1+y);
161 + rad (5.3.9.1): x - round(x / (2 pi)) 2 pi = remr(x, 2 pi);
162 + axis_rad (5.3.9.1) if useful in MPFR;
163 + cycle (5.3.10.1): rad(2 pi x / u) u / (2 pi) = remr(x, u);
164 + axis_cycle (5.3.10.1) if useful in MPFR;
165 + sinu, cosu, tanu, cotu, secu, cscu, cossinu, arcsinu, arccosu,
166 arctanu, arccotu, arcsecu, arccscu (5.3.10.{2..14}):
167 sin(x 2 pi / u), etc.;
168 [from which sinpi(x) = sin(Pi*x), ... are trivial to implement, with u=2.]
169 + arcu (5.3.10.15): arctan2(y,x) u / (2 pi);
170 + rad_to_cycle, cycle_to_rad, cycle_to_cycle (5.3.11.{1..3}).
171- From GSL, missing special functions (if useful in MPFR):
172 (cf https://www.gnu.org/software/gsl/manual/gsl-ref.html#Special-Functions)
173 + The Airy functions Ai(x) and Bi(x) defined by the integral representations:
174 * Ai(x) = (1/\pi) \int_0^\infty \cos((1/3) t^3 + xt) dt
175 * Bi(x) = (1/\pi) \int_0^\infty (e^(-(1/3) t^3) + \sin((1/3) t^3 + xt)) dt
176 * Derivatives of Airy Functions
177 + The Bessel functions for n integer and n fractional:
178 * Regular Modified Cylindrical Bessel Functions I_n
179 * Irregular Modified Cylindrical Bessel Functions K_n
180 * Regular Spherical Bessel Functions j_n: j_0(x) = \sin(x)/x,
181 j_1(x)= (\sin(x)/x-\cos(x))/x & j_2(x)= ((3/x^2-1)\sin(x)-3\cos(x)/x)/x
182 Note: the "spherical" Bessel functions are solutions of
183 x^2 y'' + 2 x y' + [x^2 - n (n+1)] y = 0 and satisfy
184 j_n(x) = sqrt(Pi/(2x)) J_{n+1/2}(x). They should not be mixed with the
185 classical Bessel Functions, also noted j0, j1, jn, y0, y1, yn in C99
186 and mpfr.
187 Cf https://en.wikipedia.org/wiki/Bessel_function#Spherical_Bessel_functions
188 *Irregular Spherical Bessel Functions y_n: y_0(x) = -\cos(x)/x,
189 y_1(x)= -(\cos(x)/x+\sin(x))/x &
190 y_2(x)= (-3/x^3+1/x)\cos(x)-(3/x^2)\sin(x)
191 * Regular Modified Spherical Bessel Functions i_n:
192 i_l(x) = \sqrt{\pi/(2x)} I_{l+1/2}(x)
193 * Irregular Modified Spherical Bessel Functions:
194 k_l(x) = \sqrt{\pi/(2x)} K_{l+1/2}(x).
195 + Clausen Function:
196 Cl_2(x) = - \int_0^x dt \log(2 \sin(t/2))
197 Cl_2(\theta) = \Im Li_2(\exp(i \theta)) (dilogarithm).
198 + Dawson Function: \exp(-x^2) \int_0^x dt \exp(t^2).
199 + Debye Functions: D_n(x) = n/x^n \int_0^x dt (t^n/(e^t - 1))
200 + Elliptic Integrals:
201 * Definition of Legendre Forms:
202 F(\phi,k) = \int_0^\phi dt 1/\sqrt((1 - k^2 \sin^2(t)))
203 E(\phi,k) = \int_0^\phi dt \sqrt((1 - k^2 \sin^2(t)))
204 P(\phi,k,n) = \int_0^\phi dt 1/((1 + n \sin^2(t))\sqrt(1 - k^2 \sin^2(t)))
205 * Complete Legendre forms are denoted by
206 K(k) = F(\pi/2, k)
207 E(k) = E(\pi/2, k)
208 * Definition of Carlson Forms
209 RC(x,y) = 1/2 \int_0^\infty dt (t+x)^(-1/2) (t+y)^(-1)
210 RD(x,y,z) = 3/2 \int_0^\infty dt (t+x)^(-1/2) (t+y)^(-1/2) (t+z)^(-3/2)
211 RF(x,y,z) = 1/2 \int_0^\infty dt (t+x)^(-1/2) (t+y)^(-1/2) (t+z)^(-1/2)
212 RJ(x,y,z,p) = 3/2 \int_0^\infty dt
213 (t+x)^(-1/2) (t+y)^(-1/2) (t+z)^(-1/2) (t+p)^(-1)
214 + Elliptic Functions (Jacobi)
215 + N-relative exponential:
216 exprel_N(x) = N!/x^N (\exp(x) - \sum_{k=0}^{N-1} x^k/k!)
217 + exponential integral:
218 E_2(x) := \Re \int_1^\infty dt \exp(-xt)/t^2.
219 Ei_3(x) = \int_0^x dt \exp(-t^3) for x >= 0.
220 Ei(x) := - PV(\int_{-x}^\infty dt \exp(-t)/t)
221 + Hyperbolic/Trigonometric Integrals
222 Shi(x) = \int_0^x dt \sinh(t)/t
223 Chi(x) := Re[ \gamma_E + \log(x) + \int_0^x dt (\cosh[t]-1)/t]
224 Si(x) = \int_0^x dt \sin(t)/t
225 Ci(x) = -\int_x^\infty dt \cos(t)/t for x > 0
226 AtanInt(x) = \int_0^x dt \arctan(t)/t
227 [ \gamma_E is the Euler constant ]
228 + Fermi-Dirac Function:
229 F_j(x) := (1/r\Gamma(j+1)) \int_0^\infty dt (t^j / (\exp(t-x) + 1))
230 + Pochhammer symbol (a)_x := \Gamma(a + x)/\Gamma(a) : see [Smith01] in
231 algorithms.bib
232 logarithm of the Pochhammer symbol
233 + Gegenbauer Functions
234 + Laguerre Functions
235 + Eta Function: \eta(s) = (1-2^{1-s}) \zeta(s)
236 Hurwitz zeta function: \zeta(s,q) = \sum_0^\infty (k+q)^{-s}.
237 + Lambert W Functions, W(x) are defined to be solutions of the equation:
238 W(x) \exp(W(x)) = x.
239 This function has multiple branches for x < 0 (2 funcs W0(x) and Wm1(x))
240 From Fredrik Johansson:
241 See https://cs.uwaterloo.ca/research/tr/1993/03/W.pdf, in particular
242 formulas 5.2 and 5.3 for the error bound: one first computes an
243 approximation w, and then evaluates the residual w e^w - x. There is an
244 expression for the error in terms of the residual and the derivative W'(t),
245 where the derivative can be bounded by piecewise simple functions,
246 something like min(1, 1/t) when t >= 0.
247 See https://arxiv.org/abs/1705.03266 for rigorous error bounds.
248 + Trigamma Function psi'(x).
249 and Polygamma Function: psi^{(m)}(x) for m >= 0, x > 0.
250- functions from ISO/IEC 24747:2009 (Extensions to the C Library,
251 to Support Mathematical Special Functions).
252 Standard: http://www.iso.org/iso/catalogue_detail.htm?csnumber=38857
253 Draft: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1292.pdf
254 Rationale: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1244.pdf
255 See also: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3060.pdf
256 (similar, for C++).
257 Also check whether the functions that are already implemented in MPFR
258 match this standard.
259
260- from gnumeric (www.gnome.org/projects/gnumeric/doc/function-reference.html):
261 - incomplete beta function, see message from Martin Maechler
262 <maechler@stat.math.ethz.ch> on 18 Jan 2016, and Section 6.6 in
263 Abramowitz & Stegun
264 - betaln
265 - degrees
266 - radians
267 - sqrtpi
268
269- mpfr_inp_raw, mpfr_out_raw (cf mail "Serialization of mpfr_t" from Alexey
270 and answer from Granlund on mpfr list, May 2007)
271- [maybe useful for SAGE] implement companion frac_* functions to the rint_*
272 functions. For example mpfr_frac_floor(x) = x - floor(x). (The current
273 mpfr_frac function corresponds to mpfr_rint_trunc.)
274- scaled erfc (https://sympa.inria.fr/sympa/arc/mpfr/2009-05/msg00054.html)
275- asec, acsc, acot, asech, acsch and acoth (mail from Björn Terelius on mpfr
276 list, 18 June 2009)
277
278- function to reduce the precision of a variable, with a ternary value in
279 input, i.e. taking care of double rounding. Two possible forms: like
280 mpfr_set (i.e. with input and output) or like mpfr_prec_round (i.e. with
281 a single variable). mpfr_subnormalize and mpfr_round_nearest_away_end
282 could use it.
283
284- UBF functions for +, -, *, fmma, /, sqrt.
285 Support UBF in mpfr_check_range or add mpfr_ubf_check_range?
286 Make this available in the API, e.g. for MPC.
287
288- mpfr_cmp_uj and mpfr_cmp_sj. They would be useful to test MPFR with
289 _MPFR_EXP_FORMAT=4.
290
291- base conversion with the round-trip property using a minimal precision,
292 such as the to_chars functions from the C++ standard:
293
294 The functions [...] ensure that the string representation consists
295 of the smallest number of characters such that there is at least
296 one digit before the radix point (if present) and parsing the
297 representation using the corresponding from_chars function
298 recovers value exactly. [Note: This guarantee applies only if
299 to_chars and from_chars are executed on the same implementation.
300 — end note] If there are several such representations, the
301 representation with the smallest difference from the
302 floating-point argument value is chosen, resolving any remaining
303 ties using rounding according to round_to_nearest.
304
305 Text from: https://www.zsh.org/mla/workers/2019/msg01138.html
306
307- Serialization / Deserialization. Suggested by Frédéric Pétrot:
308 https://sympa.inria.fr/sympa/arc/mpfr/2020-02/msg00006.html
309 like mpfr_fpif_{import,export}, but with memory instead of file.
310
311 Idea of implementation to reuse most of the code and change very little:
312
313 Instead of passing a FILE *fh, pass a struct ext_data *h, and instead of
314 using fread and fwrite, use
315 h->read (h, buffer, size)
316 h->write (h, buffer, size)
317 respectively.
318
319 The struct ext_data structure could contain the following fields:
320 * read: pointer to a wrapper function for the read method.
321 * write: pointer to a wrapper function for the write method.
322 * FILE *fh: to be used for operations with files.
323 * unsigned char *arena: to be used for operations with memory.
324
325 The wrapper functions for the read method could be:
326
327 static int
328 read_from_file (struct ext_data *h, unsigned char *buffer, size_t size)
329 {
330 return fread (buffer, size, 1, h->fh) != 1;
331 }
332
333 static int
334 read_from_memory (struct ext_data *h, unsigned char *buffer, size_t size)
335 {
336 if (h->arena == NULL)
337 return 1;
338 memcpy (buffer, h->arena, size);
339 h->arena += size;
340 return 0;
341 }
342
343 So I expect very few changes in the existing code:
344 * Write a few wrapper functions.
345 * Rename mpfr_fpif_export to mpfr_fpif_export_aux and
346 mpfr_fpif_import to mpfr_fpif_import_aux.
347 * In the existing functions, replace FILE *fh, and fread/fwrite
348 calls as mentioned above.
349 * Add new mpfr_fpif_export, mpfr_fpif_import, mpfr_fpif_export_mem,
350 mpfr_fpif_import_mem.
351
352##############################################################################
3535. Efficiency
354##############################################################################
355
356- Fredrik Johansson reports that mpfr_ai is slow for large arguments: an
357 asymptotic expansion should be used (once done, remove REDUCE_EMAX from
358 tests/tai.c and update the description in mpfr.texi).
359- for exp(x), Fredrik Johansson reports a 20% speed improvement starting from
360 4000 bits, and up to a 75% memory improvement in his Arb implementation, by
361 using recursive instead of iterative binary splitting:
362 https://github.com/fredrik-johansson/arb/blob/master/elefun/exp_sum_bs_powtab.c
363- improve mpfr_grandom using the algorithm in http://arxiv.org/abs/1303.6257
364- implement a mpfr_sqrthigh algorithm based on Mulders' algorithm, with a
365 basecase variant
366- use mpn_div_q to speed up mpfr_div. However mpn_div_q, which is new in
367 GMP 5, is not documented in the GMP manual, thus we are not sure it
368 guarantees to return the same quotient as mpn_tdiv_qr.
369 Also mpfr_div uses the remainder computed by mpn_divrem. A workaround would
370 be to first try with mpn_div_q, and if we cannot (easily) compute the
371 rounding, then use the current code with mpn_divrem.
372- improve atanh(x) for small x by using atanh(x) = log1p(2x/(1-x)),
373 and log1p should also be improved for small arguments.
374- compute exp by using the series for cosh or sinh, which has half the terms
375 (see Exercise 4.11 from Modern Computer Arithmetic, version 0.3)
376 The same method can be used for log, using the series for atanh, i.e.,
377 atanh(x) = 1/2*log((1+x)/(1-x)).
378- improve mpfr_gamma (see https://code.google.com/p/fastfunlib/). A possible
379 idea is to implement a fast algorithm for the argument reconstruction
380 gamma(x+k): instead of performing k products by x+i, we could precompute
381 x^2, ..., x^m for m ~ sqrt(k), and perform only sqrt(k) products.
382 One could also use the series for 1/gamma(x), see for example
383 http://dlmf.nist.gov/5/7/ or formula (36) from
384 http://mathworld.wolfram.com/GammaFunction.html
385- improve the computation of Bernoulli numbers: instead of computing just one
386 B[2n] at a time in mpfr_bernoulli_internal, we could compute several at a
387 time, sharing the expensive computation of the 1/p^(2n) series.
388- fix regression with mpfr_mpz_root (from Keith Briggs, 5 July 2006), for
389 example on 3Ghz P4 with gmp-4.2, x=12.345:
390 prec=50000 k=2 k=3 k=10 k=100
391 mpz_root 0.036 0.072 0.476 7.628
392 mpfr_mpz_root 0.004 0.004 0.036 12.20
393 See also mail from Carl Witty on mpfr list, 09 Oct 2007.
394- for sparse input (say x=1 with 2 bits), mpfr_exp is not faster than for
395 full precision when precision <= MPFR_EXP_THRESHOLD. The reason is
396 that argument reduction kills sparsity. Maybe avoid argument reduction
397 for sparse input?
398- speed up mpfr_atan for large arguments (to speed up mpc_log) see FR #6198
399- improve mpfr_sin on values like ~pi (do not compute sin from cos, because
400 of the cancellation). For instance, reduce the input modulo pi/2 in
401 [-pi/4,pi/4], and define auxiliary functions for which the argument is
402 assumed to be already reduced (so that the sin function can avoid
403 unnecessary computations by calling the auxiliary cos function instead of
404 the full cos function). This will require a native code for sin, for
405 example using the reduction sin(3x)=3sin(x)-4sin(x)^3.
406 See https://sympa.inria.fr/sympa/arc/mpfr/2007-08/msg00001.html and
407 the following messages.
408- improve generic.c to work for number of terms <> 2^k
409- rewrite mpfr_greater_p... as native code.
410
411- mpf_t uses a scheme where the number of limbs actually present can
412 be less than the selected precision, thereby allowing low precision
413 values (for instance small integers) to be stored and manipulated in
414 an mpf_t efficiently.
415
416 Perhaps mpfr should get something similar, especially if looking to
417 replace mpf with mpfr, though it'd be a major change. Alternately
418 perhaps those mpfr routines like mpfr_mul where optimizations are
419 possible through stripping low zero bits or limbs could check for
420 that (this would be less efficient but easier).
421
422- try the idea of the paper "Reduced Cancellation in the Evaluation of Entire
423 Functions and Applications to the Error Function" by W. Gawronski, J. Mueller
424 and M. Reinhard, to be published in SIAM Journal on Numerical Analysis: to
425 avoid cancellation in say erfc(x) for x large, they compute the Taylor
426 expansion of erfc(x)*exp(x^2/2) instead (which has less cancellation),
427 and then divide by exp(x^2/2) (which is simpler to compute).
428
429- replace the *_THRESHOLD macros by global (TLS) variables that can be
430 changed at run time (via a function, like other variables)? One benefit
431 is that users could use a single MPFR binary on several machines (e.g.,
432 a library provided by binary packages or shared via NFS) with different
433 thresholds. On the default values, this would be a bit less efficient
434 than the current code, but this isn't probably noticeable (this should
435 be tested). Something like:
436 long *mpfr_tune_get(void) to get the current values (the first value
437 is the size of the array).
438 int mpfr_tune_set(long *array) to set the tune values.
439 int mpfr_tune_run(long level) to find the best values (the support
440 for this feature is optional, this can also be done with an
441 external function).
442
443- better distinguish different processors (for example Opteron and Core 2)
444 and use corresponding default tuning parameters (as in GMP). This could be
445 done in configure.ac to avoid hacking config.guess, for example define
446 MPFR_HAVE_CORE2.
447 Note (VL): the effect on cross-compilation (that can be a processor
448 with the same architecture, e.g. compilation on a Core 2 for an
449 Opteron) is not clear. The choice should be consistent with the
450 build target (e.g. -march or -mtune value with gcc).
451 Also choose better default values. For instance, the default value of
452 MPFR_MUL_THRESHOLD is 40, while the best values that have been found
453 are between 11 and 19 for 32 bits and between 4 and 10 for 64 bits!
454
455- during the Many Digits competition, we noticed that (our implantation of)
456 Mulders short product was slower than a full product for large sizes.
457 This should be precisely analyzed and fixed if needed.
458
459- for various functions, check the timings as a function of the magnitude
460 of the input (and the input and/or output precisions?), and use better
461 thresholds for asymptotic expansions.
462
463- improve the special case of mpfr_{add,sub} (x, x, y, ...) when |x| > |y|
464 to do the addition in-place and have a complexity of O(prec(y)) in most
465 cases. The mpfr_{add,sub}_{d,ui} functions should automatically benefit
466 from this change.
467
468- in gmp_op.c, for functions with mpz_srcptr, check whether mpz_fits_slong_p
469 is really useful in all cases (see TODO in this file).
470
471- optimize code that uses a test based on the fact that x >> s is
472 undefined in C for s == width of x but the result is expected to
473 be 0. ARM and PowerPC could benefit from such an optimization,
474 but not x86. This needs support from the compiler.
475 For PowerPC: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79233
476
477- deal with MPFR_RNDF in mpfr_round_near_x (replaced by MPFR_RNDZ).
478
479- instead of a fixed mparam.h, optionally use function multiversioning
480 (FMV), currently only available with the GNU C++ front end:
481 https://gcc.gnu.org/wiki/FunctionMultiVersioning
482 According to https://lwn.net/Articles/691932/ the dispatch resolution
483 is now done by the dynamic loader, so that this should be fast enough
484 (the cost would be the reading of a static variable, initialized at
485 load time, instead of a constant).
486 In particular, binary package distributions would benefit from FMV as
487 only one binary is generated for different processor families.
488
489
490##############################################################################
4916. Miscellaneous
492##############################################################################
493
494- [suggested by Tobias Burnus <burnus(at)net-b.de> and
495 Asher Langton <langton(at)gcc.gnu.org>, Wed, 01 Aug 2007]
496 support quiet and signaling NaNs in mpfr:
497 * functions to set/test a quiet/signaling NaN: mpfr_set_snan, mpfr_snan_p,
498 mpfr_set_qnan, mpfr_qnan_p
499 * correctly convert to/from double (if encoding of s/qNaN is fixed in 754R)
500 Note: Signaling NaNs are not specified by the ISO C standard and may
501 not be supported by the implementation. GCC needs the -fsignaling-nans
502 option (but this does not affect the C library, which may or may not
503 accept signaling NaNs).
504
505- check the constants mpfr_set_emin (-16382-63) and mpfr_set_emax (16383) in
506 get_ld.c and the other constants, and provide a testcase for large and
507 small numbers.
508
509- from Kevin Ryde <user42@zip.com.au>:
510 Also for pi.c, a pre-calculated compiled-in pi to a few thousand
511 digits would be good value I think. After all, say 10000 bits using
512 1250 bytes would still be small compared to the code size!
513 Store pi in round to zero mode (to recover other modes).
514
515- add other prototypes for round to nearest-away (mpfr_round_nearest_away
516 only deals with the prototypes of say mpfr_sin) or implement it as a native
517 rounding mode
518- add a new roundind mode: round to odd. If the result is not exactly
519 representable, then round to the odd mantissa. This rounding
520 has the nice property that for k > 1, if:
521 y = round(x, p+k, TO_ODD)
522 z = round(y, p, TO_NEAREST_EVEN), then
523 z = round(x, p, TO_NEAREST_EVEN)
524 so it avoids the double-rounding problem.
525 VL: I prefer the (original?) term "sticky rounding", as used in
526 J Strother Moore, Tom Lynch, Matt Kaufmann. A Mechanically Checked
527 Proof of the Correctness of the Kernel of the AMD5K86 Floating-Point
528 Division Algorithm. IEEE Transactions on Computers, 1996.
529 and
530 http://www.russinoff.com/libman/text/node26.html
531
532- new rounding mode MPFR_RNDE when the result is known to be exact?
533 * In normal mode, this would allow MPFR to optimize using
534 this information.
535 * In debug mode, MPFR would check that the result is exact
536 (i.e. that the ternary value is 0).
537
538- add tests of the ternary value for constants
539
540- When doing Extensive Check (--enable-assert=full), since all the
541 functions use a similar use of MACROS (ZivLoop, ROUND_P), it should
542 be possible to do such a scheme:
543 For the first call to ROUND_P when we can round.
544 Mark it as such and save the approximated rounding value in
545 a temporary variable.
546 Then after, if the mark is set, check if:
547 - we still can round.
548 - The rounded value is the same.
549 It should be a complement to tgeneric tests.
550
551- in div.c, try to find a case for which cy != 0 after the line
552 cy = mpn_sub_1 (sp + k, sp + k, qsize, cy);
553 (which should be added to the tests), e.g. by having {vp, k} = 0, or
554 prove that this cannot happen.
555
556- add a configure test for --enable-logging to ignore the option if
557 it cannot be supported. Modify the "configure --help" description
558 to say "on systems that support it".
559
560- add generic bad cases for functions that don't have an inverse
561 function that is implemented (use a single Newton iteration).
562
563- add bad cases for the internal error bound (by using a dichotomy
564 between a bad case for the correct rounding and some input value
565 with fewer Ziv iterations?).
566
567- add an option to use a 32-bit exponent type (int) on LP64 machines,
568 mainly for developers, in order to be able to test the case where the
569 extended exponent range is the same as the default exponent range, on
570 such platforms.
571 Tests can be done with the exp-int branch (added on 2010-12-17, and
572 many tests fail at this time).
573
574- test underflow/overflow detection of various functions (in particular
575 mpfr_exp) in reduced exponent ranges, including ranges that do not
576 contain 0.
577
578- add an internal macro that does the equivalent of the following?
579 MPFR_IS_ZERO(x) || MPFR_GET_EXP(x) <= value
580
581- check whether __gmpfr_emin and __gmpfr_emax could be replaced by
582 a constant (see README.dev). Also check the use of MPFR_EMIN_MIN
583 and MPFR_EMAX_MAX.
584
585- add a test checking that no mpfr.h macros depend on mpfr-impl.h
586 (the current tests cannot check that since mpfr-impl.h is always
587 included).
588
589- move some macro definitions from acinclude.m4 to the m4 directory
590 as suggested by the Automake manual? The reason is that the
591 acinclude.m4 file is big and a bit difficult to read.
592
593- use symbol versioning.
594
595- check whether mpz_t caching (pool) is necessary. Timings with -static
596 with details about the C / C library implementation should be put
597 somewhere as a comment in the source or in the doc. Using -static
598 is important because otherwise the cache saves the dynamic call to
599 mpz_init and mpz_clear; so, what we're measuring is not clear.
600 See thread:
601 https://gmplib.org/list-archives/gmp-devel/2015-September/004147.html
602 Summary: It will not be integrated in GMP because 1) This yields
603 problems with threading (in MPFR, we have TLS variables, but this is
604 not the case of GMP). 2) The gain (if confirmed with -static) would
605 be due to a poor malloc implementation (timings would depend on the
606 platform). 3) Applications would use more RAM.
607 Additional notes [VL]: the major differences in the timings given
608 by Patrick in 2014-01 under Linux were:
609 Before:
610 arccos(x) took 0.054689 ms (32767 eval in 1792 ms)
611 arctan(x) took 0.042116 ms (32767 eval in 1380 ms)
612 After:
613 arccos(x) took 0.043580 ms (32767 eval in 1428 ms)
614 arctan(x) took 0.035401 ms (32767 eval in 1160 ms)
615 mpfr_acos doesn't use mpz, but calls mpfr_atan, so that the issue comes
616 from mpfr_atan, which uses mpz a lot. The problem mainly comes from the
617 reallocations in GMP because mpz_init is used instead of mpz_init2 with
618 the estimated maximum size. Other places in the code that uses mpz_init
619 may be concerned.
620 Issues with mpz_t caching:
621 * The pool can take much memory, which may no longer be useful.
622 For instance:
623 mpfr_init2 (x, 10000000);
624 mpfr_log_ui (x, 17, MPFR_RNDN);
625 /* ... */
626 mpfr_clear (x);
627 /* followed by code using only small precision */
628 while contrary to real caches, they contain no data. This is not
629 valuable memory: freeing/allocating a large block of memory is
630 much faster than the actual computations, so that mpz_t caching
631 has no impact on the performance in such cases. A pool with large
632 blocks also potentially destroys the data locality.
633 * It assumes that the real GMP functions are __gmpz_init and
634 __gmpz_clear, which are not part of the official GMP API, thus
635 is based on GMP internals, which may change in the future or
636 may be different in forks / compatible libraries / etc. This
637 can be solved if MPFR code calls mpfr_mpz_init / mpfr_mpz_clear
638 directly, avoiding the #define's.
639 Questions that need to be answered:
640 * What about the comparisons with other memory allocators?
641 * Shouldn't the pool be part of the memory allocator?
642 For the default memory allocator (malloc): RFE?
643 If it is decided to keep some form of mpz_t caching, a possible solution
644 for both issues: define mpfr_mpz_init2 and mpfr_mpz_clear2, which both
645 take 2 arguments like mpz_init2, where mpfr_mpz_init2 behaves in a way
646 similar to mpz_init2, and mpfr_mpz_clear2 behaves in a way similar to
647 mpz_clear but where the size argument is a hint for the pool; if it is
648 too large, then the mpz_t should not be pushed back to the pool. The
649 size argument of mpfr_mpz_init2 could also be a hint to decide which
650 element to pull from the pool.
651
652- in tsum, add testcases for mpfr_sum triggering the bug fixed in r9722,
653 that is, with a large error during the computation of the secondary term
654 (when the TMD occurs).
655
656- use the keyword "static" in array indices of parameter declarations with
657 C99 compilers (6.7.5.3p7) when the pointer is expected not to be null?
658 For instance, if mpfr.h is changed to have:
659 __MPFR_DECLSPEC void mpfr_dump (const __mpfr_struct [static 1]);
660 and one calls
661 mpfr_dump (NULL);
662 one gets a warning with Clang. This is just an example; this needs to be
663 done in a clean way.
664 See:
665 http://stackoverflow.com/a/3430353/3782797
666 https://hamberg.no/erlend/posts/2013-02-18-static-array-indices.html
667
668- change most mpfr_urandomb occurrences to mpfr_urandom in the tests?
669 (The one done in r10573 allowed us to find a bug even without
670 assertion checking.)
671
672- tzeta has been much slower since r9848 (which increases the precision
673 of the input for the low output precisions), at least with the x86
674 32-bit ABI. This seems to come from the fact that the working precision
675 in the mpfr_zeta implementation depends on the precision of the input.
676 Once mpfr_zeta has improved, change the last argument of test_generic
677 in tzeta.c back to 5 (as it was before r10667).
678
679- check the small-precision tables in the tests?
680 This may require to export some pointer to the tables, but this could
681 be done only if some debug macro is defined.
682
683- optionally use malloc() for the caches? See mpfr_mp_memory_cleanup.
684 Note: This can be implemented by adding a TLS flag saying whether we
685 are under cache generation or not, and by making the MPFR allocation
686 functions consider this flag. Moreover, this can only work for mpfr_t
687 caching (floating-point constants), not for mpz_t caching (Bernoulli
688 constants) because we do not have the control of memory allocation for
689 mpz_init.
690
691- use GCC's nonnull attribute (available since GCC 4.0) where applicable.
692
693- avoid the use of MPFR_MANT(x) as an lvalue; use other (more high level)
694 internal macros if possible, such as MPFR_TMP_INIT1, MPFR_TMP_INIT and
695 MPFR_ALIAS.
696
697
698##############################################################################
6997. Portability
700##############################################################################
701
702- add a web page with results of builds on different architectures
703
704- [Kevin about texp.c long strings]
705 For strings longer than c99 guarantees, it might be cleaner to
706 introduce a "tests_strdupcat" or something to concatenate literal
707 strings into newly allocated memory. I thought I'd done that in a
708 couple of places already. Arrays of chars are not much fun.
709
710- use https://gcc.gnu.org/viewcvs/gcc/trunk/config/stdint.m4 for mpfr-gmp.h
711
712- By default, GNU Automake adds -I options to local directories, with
713 the side effect that these directories have the precedence to search
714 for system headers (#include <...>). This may make the build fail if
715 a C implementation includes a file that has the same name as one used
716 in such a directory.
717 For instance, if one adds an empty file "src/bits/types.h", then the
718 MPFR build fails under Linux because /usr/include/stdio.h has
719 #include <bits/types.h>
720 Possible workaround:
721 * disable the default -I options with nostdinc as documented in
722 the Automake manual;
723 * have a rule that copies the needed files ("mpfr.h" or they should
724 be prefixed with "mpfr-") to $(top_builddir)/include;
725 * use "-I$(top_builddir)/include".
Note: See TracBrowser for help on using the repository browser.