计算任意大整数的整数平方根 (isqrt) 的有效算法

2024-01-04

Notice

对于解决方案Erlang or C / C++, go to Trial 4 below.

维基百科文章

整数平方根 http://en.wikipedia.org/wiki/Integer_square_root

“整数平方根”的定义可以在这里找到

计算平方根的方法 http://en.wikipedia.org/wiki/Methods_of_computing_square_roots

一种算法可以做到“位魔法”可以在这里找到

[ 试用一：使用库函数 ]

Code

isqrt(N) when erlang:is_integer(N), N >= 0 ->
    erlang:trunc(math:sqrt(N)).

Problem

该实现使用sqrt()来自 C 库的函数，因此它不适用于任意大的整数（请注意，返回的结果与输入不匹配。正确的答案应该是12345678901234567890):

Erlang R16B03 (erts-5.10.4) [source] [64-bit] [smp:8:8] [async-threads:10] [hipe] [kernel-poll:false]

Eshell V5.10.4  (abort with ^G)
1> erlang:trunc(math:sqrt(12345678901234567890 * 12345678901234567890)).
12345678901234567168
2>

[ 试用 2 ：使用 Bigint`+` Only ]

Code

isqrt2(N) when erlang:is_integer(N), N >= 0 ->
    isqrt2(N, 0, 3, 0).

isqrt2(N, I, _, Result) when I >= N ->
    Result;

isqrt2(N, I, Times, Result) ->
    isqrt2(N, I + Times, Times + 2, Result + 1).

描述

该实施基于以下观察：

isqrt(0) = 0   # <--- One 0
isqrt(1) = 1   # <-+
isqrt(2) = 1   #   |- Three 1's
isqrt(3) = 1   # <-+
isqrt(4) = 2   # <-+
isqrt(5) = 2   #   |
isqrt(6) = 2   #   |- Five 2's
isqrt(7) = 2   #   |
isqrt(8) = 2   # <-+
isqrt(9) = 3   # <-+
isqrt(10) = 3  #   |
isqrt(11) = 3  #   |
isqrt(12) = 3  #   |- Seven 3's
isqrt(13) = 3  #   |
isqrt(14) = 3  #   |
isqrt(15) = 3  # <-+
isqrt(16) = 4  # <--- Nine 4's
...

Problem

此实施涉及onlybigint 添加，所以我希望它运行得很快。然而，当我喂它时1111111111111111111111111111111111111111 * 1111111111111111111111111111111111111111，它似乎永远在我的（非常快的）机器上运行。

[ 试验 3 ：使用 Bigint 进行二分搜索`+1`, `-1` and `div 2` Only ]

Code

变体1（我最初的实现）

isqrt3(N) when erlang:is_integer(N), N >= 0 ->
    isqrt3(N, 1, N).

isqrt3(_N, Low, High) when High =:= Low + 1 ->
    Low;

isqrt3(N, Low, High) ->
    Mid = (Low + High) div 2,
    MidSqr = Mid * Mid,
    if
        %% This also catches N = 0 or 1
        MidSqr =:= N ->
            Mid;
        MidSqr < N ->
            isqrt3(N, Mid, High);
        MidSqr > N ->
            isqrt3(N, Low, Mid)
    end.

变体 2（修改上面的代码，使边界改为 Mid+1 或 Mid-1，参考维克拉姆·巴特的回答 https://stackoverflow.com/a/21658976/142239)

isqrt3a(N) when erlang:is_integer(N), N >= 0 ->
    isqrt3a(N, 1, N).

isqrt3a(N, Low, High) when Low >= High ->
    HighSqr = High * High,
    if
        HighSqr > N ->
            High - 1;
        HighSqr =< N ->
            High
    end;

isqrt3a(N, Low, High) ->
    Mid = (Low + High) div 2,
    MidSqr = Mid * Mid,
    if
        %% This also catches N = 0 or 1
        MidSqr =:= N ->
            Mid;
        MidSqr < N ->
            isqrt3a(N, Mid + 1, High);
        MidSqr > N ->
            isqrt3a(N, Low, Mid - 1)
    end.

Problem

现在它解决了79位数字（即1111111111111111111111111111111111111111 * 1111111111111111111111111111111111111111）以闪电般的速度，结果立即显示。然而，在我的机器上需要 60 秒（± 2 秒）才能解出一百万（1,000,000）个 61 位数字（即，1000000000000000000000000000000000000000000000000000000000000 to 1000000000000000000000000000000000000000000000000000001000000）。我想做得更快。

[ 试验 4：将牛顿法与 Bigint 结合使用`+` and `div` Only ]

Code

isqrt4(0) -> 0;

isqrt4(N) when erlang:is_integer(N), N >= 0 ->
    isqrt4(N, N).

isqrt4(N, Xk) ->
    Xk1 = (Xk + N div Xk) div 2,
    if
        Xk1 >= Xk ->
            Xk;
        Xk1 < Xk ->
            isqrt4(N, Xk1)
    end.

C / C++ 代码（出于您的兴趣）

递归变体

#include <stdint.h>

uint32_t isqrt_impl(
    uint64_t const n,
    uint64_t const xk)
{
    uint64_t const xk1 = (xk + n / xk) / 2;
    return (xk1 >= xk) ? xk : isqrt_impl(n, xk1);
}

uint32_t isqrt(uint64_t const n)
{
    if (n == 0) return 0;
    if (n == 18446744073709551615ULL) return 4294967295U;
    return isqrt_impl(n, n);
}

迭代变体

#include <stdint.h>

uint32_t isqrt_iterative(uint64_t const n)
{
    uint64_t xk = n;
    if (n == 0) return 0;
    if (n == 18446744073709551615ULL) return 4294967295U;
    do
    {
        uint64_t const xk1 = (xk + n / xk) / 2;
        if (xk1 >= xk)
        {
            return xk;
        }
        else
        {
            xk = xk1;
        }
    } while (1);
}

Problem

Erlang 代码在我的机器上在 40 秒（± 1 秒）内解决了一百万（1,000,000）个 61 位数字，所以这比Trial 3。还能跑得更快吗？

关于我的机器

处理器：3.4 GHz 英特尔酷睿 i7

Memory :32 GB 1600 MHz DDR3

OS :Mac OS X 版本 10.9.1