那么,您将如何加快 ruby 实施速度?尽管它们是不同的语言,但可以应用类似的优化,即记忆和更智能的算法。
1. 记忆
记忆可以防止您一遍又一遍地计算阶乘。
这是您的配对版本:
pairs :: (Integer, Integer) -> [(Integer, Integer)]
pairs (lower, upper) = [(m, n) | m <- [lower..upper], n <- [lower..upper], 1 + factorial n == m*m]
where factorial n = product [1..n]
阶乘多久被调用一次?好吧,我们可以说它至少被调用了upper - lower
次,尽管我们可能不记得之前调用的值。在这种情况下,我们需要(upper - lower)²
阶乘的调用。尽管阶乘计算起来相当简单,但它并不是免费的。
如果我们生成一个无限的阶乘列表并简单地选择正确的阶乘会怎样?
pairsMem :: (Integer, Integer) -> [(Integer, Integer)]
pairsMem (lower, upper) = [(m, n) | m <- [lower..upper], n <- [lower..upper], 1 + factorial n == m*m]
where factorial = (factorials!!) . fromInteger
factorials = scanl (*) 1 [1..]
Now factorials
是列表[1,1,2,6,24,…]
, and factorial
只需查找相应的值即可。两个版本比较如何?
你的版本
main = print $ pairs (0,1000)
> ghc --make SO.hs -O2 -rtsopts > /dev/null
> ./SO.hs +RTS -s
[(5,4),(11,5),(71,7)]
204,022,149,768 bytes allocated in the heap
220,119,948 bytes copied during GC
41,860 bytes maximum residency (2 sample(s))
20,308 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 414079 colls, 0 par 2.39s 2.23s 0.0000s 0.0001s
Gen 1 2 colls, 0 par 0.00s 0.00s 0.0001s 0.0001s
INIT time 0.00s ( 0.00s elapsed)
MUT time 67.33s ( 67.70s elapsed)
GC time 2.39s ( 2.23s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 69.72s ( 69.93s elapsed)
%GC time 3.4% (3.2% elapsed)
Alloc rate 3,030,266,322 bytes per MUT second
Productivity 96.6% of total user, 96.3% of total elapsed
大约68秒。
pairsMem
main = print $ pairsMem (0,1000)
> ghc --make -O2 -rtsopts SO.hs > /dev/null
> ./SO.hs +RTS -s
[(5,4),(11,5),(71,7)]
551,558,988 bytes allocated in the heap
644,420 bytes copied during GC
231,120 bytes maximum residency (2 sample(s))
71,504 bytes maximum slop
2 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 1159 colls, 0 par 0.00s 0.01s 0.0000s 0.0001s
Gen 1 2 colls, 0 par 0.00s 0.00s 0.0001s 0.0002s
INIT time 0.00s ( 0.00s elapsed)
MUT time 2.17s ( 2.18s elapsed)
GC time 0.00s ( 0.01s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 2.17s ( 2.18s elapsed)
%GC time 0.0% (0.3% elapsed)
Alloc rate 253,955,217 bytes per MUT second
Productivity 100.0% of total user, 99.5% of total elapsed
大约两秒或只有原始时间的 3%。对于一个几乎微不足道的改变来说还不错。然而,正如您所看到的,我们使用了两倍的内存。毕竟,我们将阶乘保存在列表中。然而,分配的字节总数是非记忆变体的 0.27%,因为我们不需要重新生成product
.
pairsMem (100,10000)
如果数字很大呢?你说的是(100,1000)
45 分钟后你停了下来。记忆版本有多快?
main = print $ pairsMem (100,10000)
> ghc --make -O2 -rtsopts SO.hs > /dev/null
> ./SO.hs +RTS -s
… 20 minutes later Ctrl+C…
这仍然需要太长时间。我们还能做什么?
2. 更聪明的配对
让我们回到绘图板。您正在检查 (lower,upper) 中的所有对 (n,m)。这合理吗?
事实上,不,因为阶乘增长得非常快。所以对于任何自然数让f(m)
是最大的自然数,使得f(m)! <= m
。现在,对于任意m
,我们只需要检查f(m)
第一个阶乘 - 所有其他阶乘都会更大。
只是为了记录,f(10^100)
is 70.
现在策略很明确了:我们根据需要生成尽可能多的阶乘,然后简单地检查是否m * m - 1
在阶乘列表中:
import Data.Maybe (isJust)
import Data.List (elemIndex)
pairsList :: (Integer, Integer) -> [(Integer, Integer)]
pairsList (lower, upper) = [(m, fromIntegral ret)
| m <- [lower..upper],
let l = elemIndex (m*m - 1) fs,
isJust l,
let Just ret = l
]
where fs = takeWhile (<upper*upper) $ scanl (*) 1 [1..]
这个版本的表现如何pairsMemLim
?
main = print $ pairsList (1, 10^8)
> ghc --make -O2 -rtsopts SO.hs > /dev/null
> ./SO +RTS -s
[(5,4),(11,5),(71,7)]
21,193,518,276 bytes allocated in the heap
2,372,136 bytes copied during GC
58,672 bytes maximum residency (2 sample(s))
19,580 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 40823 colls, 0 par 0.06s 0.11s 0.0000s 0.0000s
Gen 1 2 colls, 0 par 0.00s 0.00s 0.0001s 0.0001s
INIT time 0.00s ( 0.00s elapsed)
MUT time 38.17s ( 38.15s elapsed)
GC time 0.06s ( 0.11s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 38.23s ( 38.26s elapsed)
%GC time 0.2% (0.3% elapsed)
Alloc rate 555,212,922 bytes per MUT second
Productivity 99.8% of total user, 99.8% of total elapsed
好吧,降到40岁。但是,如果我们使用提供更有效查找的数据结构呢?
3. 使用正确的数据结构
由于我们想要有效的查找,我们将使用Set
。功能几乎保持不变,但是fs
将是Set Integer
,并且查找是通过完成的lookupIndex
:
import Data.Maybe (isJust)
import qualified Data.Set as S
pairsSet :: (Integer, Integer) -> [(Integer, Integer)]
pairsSet (lower, upper) = [(m, 1 + fromIntegral ret)
| m <- [lower..upper],
let l = S.lookupIndex (m*m - 1) fs,
isJust l,
let Just ret = l
]
where fs = S.fromList . takeWhile (<upper*upper) $ scanl (*) 1 [1..]
这里的表现是pairsSet
:
> ./SO +RTS -s
[(5,4),(11,5),(71,7)]
18,393,520,096 bytes allocated in the heap
2,069,872 bytes copied during GC
58,752 bytes maximum residency (2 sample(s))
19,580 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 35630 colls, 0 par 0.06s 0.08s 0.0000s 0.0001s
Gen 1 2 colls, 0 par 0.00s 0.00s 0.0001s 0.0001s
INIT time 0.00s ( 0.00s elapsed)
MUT time 18.52s ( 18.52s elapsed)
GC time 0.06s ( 0.08s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 18.58s ( 18.60s elapsed)
%GC time 0.3% (0.4% elapsed)
Alloc rate 993,405,304 bytes per MUT second
Productivity 99.7% of total user, 99.5% of total elapsed
我们的优化之旅到此结束。顺便说一句,我们已经降低了复杂性????(n³)
to ????(n log n)
因为我们的数据结构为我们提供了对数搜索。