The p196_mpi page
What is this
This is the home for the program "p196_mpi", a distributed program
whose purpose is to go forward in the palindrome
quest & the study
numbers. For more information, see Wade VanLandingham's
196 and other Lychrel numbers. A
brute force approach will not prove 196 is a Lychrel number (it can
only disprove), but the computational problem itself is interesting.
The code was written as a proof-of-concept for a distributed version of
the algorithm. It is still a work-in-progress and not very clean, but
it achieves its purpose: it is fast on clusters. The code was used on a
network of 10 and 12 dual-sockets, quad-cores nodes, with linear
scaling from 1 to 12 nodes on large problem sizes (using InfiniBand DDR). At over 450M
digits, it still had decent scalability on 60 similar nodes connected
via InfiniBand QDR. The kernel code
should be memory bandwidth-bound on pretty much all x86_64 hardware.
It is distributed under the GPL v2, so anyone
can use, fix or improve it. Optimized computation kernels are welcome,
as only x86_64 (SSE3, SSSE3 or SSE4) and POWER7 (AltiVec/VSX) are
available at the moment. The pure C code is usually compute-bound
rather than memory-bound, and usually not fast enough to be useful for
anything other than testing. Explanations of the algorithm and
implementation are here. The code was also
presented as a research poster at ISC'14 (archive of the poster here).
It is known to still scale decently to 60+ nodes, if the problem
is large enough. Note that the
code is sensitive to network latency. It doesn't work well over
Ethernet, and was exploited only over InfiniBand networks. It
also works fine inside a shared-memory machine. In all cases, you will
need a MPI library for your system, such as OpenMPI; an excellent one to
exploit efficiently intra-node and inter-nodes parallelism is MPC (MultiProcessor Computing).
This code was used to replicate all known results up to 300M digits on
August 27, 2011. It was then used to push the number of iterations to one
billion (over 413M digits), a milestone reached on October 2, 2011.
During test runs on a different system in late January 2012, it reached
500M digits on January 25, 2012. Another series of tests in early July 2012 reached 600M
digits on July 8, 2012. A billion digits was reached in february 2015.
The code is developed primarily under Linux. It was adapted to compile
& run under Windows, but it is much less tested on that platform.
You will need an input datafile to use this, the code cannot start from
a seed. To produce
a compatible input file from a seed number, you can try using the
& Veigneau code linked on
archive.org from here. There is also a backup of that file here.
A sample input (20M digits) is available here.
The current version 1.9.0 is here:
As a gzip'ed tar file
As a zip file
1.9.0: Fixes for large-scale runs
1.8.0: AVX2 support
1.7.0: Minor MPI fixes, add ARM/NEON support