PlayStation 3 computing breaks 2^60 barrier 112-bit prime ECDLP solved

Introduction

We are pleased to announce that we set a new record for the elliptic curve discrete logarithm problem (ECDLP) by solving it over a 112-bit finite field. The previous record was for a 109-bit prime field and dates back from October 2002. Our calculation was done on a cluster of PlayStation 3 game consoles at the Laboratory for Cryptologic Algorithms at the EPFL.

What we did

Given the 112-bit prime p = (2 128 – 3) / (11 * 6949), let E be the elliptic curve over the finite field F p of p elements given by the Weierstrass equation y 2 = x 3 + ax + b with
a = 4451685225093714772084598273548424 and
b = 2061118396808653202902996166388514.
 
The point P in the group of points E ( F p ) of E over F p with ( x , y ) coordinates (188281465057972534892223778713752,3419875491033170827167861896082688) has prime order 4451685225093714776491891542548933.
We solved the discrete logarithm with respect to P for the point
 
Q = (1415926535897932384626433832795028, 3846759606494706724286139623885544)
in E ( F p ) and found that Q = 312521636014772477161767351856699 * P . Note that the x -coordinate of Q was chosen as [(π-3)*10 34 ].

How long it took

The calculation was started on January 13, 2009, and finished on July 8, 2009. It ran on and off, occasionally interrupted by other cryptanalytic projects such as this or this or PlayLaB visits as described here (in French), on a cluster of more than 200 PlayStation 3 (PS3) game consoles. When run continuously using the latest version of our code, the same calculation would have taken 3.5 months. In total, about 8.5 * 1016 elliptic curve point additions were performed, which translates to approximately 1018 modular additions and multiplications on 112-bit integers. On the PS3, the effort is equivalent to about 14 full 56-bit DES key searches.

As far as we know our calculation is a new record for ECDLP over prime fields. It beats the previous record of a curve over a 109-bit prime field which was set in 2002 requiring 549 days of computing by more than 10,000 members in almost 250 teams, spending the 2002 equivalent of a year of full time computing on 4000 to 5000 PCs (cf. here and here).

Some details

We used the common parallelized version of Pollard’s rho method, with a 24-bit distinguishing property. We collected more than five billion distinguished points before a collision was found. This is very close to what we expected based on the birthday paradox using the customary 50% probability of success. The distinguished points were stored in uncompressed plaintext format to facilitate collision finding using standard unix commands. Storage required 0.6 Terabyte of disk space.
 
Full details of the computation will be described in a forthcoming publication. Of particular interest is our new 128-bit ‘sloppy reduction’ which combines a redundant representation of residue classes modulo the 112-bit prime p with fast 128-bit truncation. The latter may result in an incorrect result, but that happens with such low probability that it has not occurred even once during the entire calculation. If an error would have occurred, it could have resulted in an incorrect distinguished point, which would have been caught. Sloppy reduction allowed us to make our code branch-free and thereby very efficient on the PS3’s 4-way SIMD Synergistic Processing Units (SPUs). More on our implementation can be found in the Appendix of this paper. Our technique is applicable to other specially formatted primes as well, such as the prime 2255-19 as proposed in this context by Professor Daniel J. Bernstein. We did not use the common negation map since it requires branching and results in code that runs slower in a SIMD environment.
 
Per SPU four trails were processed in SIMD mode, two of those four-tuples were pipelined to avoid all processor latencies, and 50 of those four-tuple pairs were run simultaneously to profit from Montgomery’s simultaneous inversion. With 6 SPUs per PS3, a single PS3 thus processed 2400 trails in parallel, with the entire cluster doing more than half a million trails in parallel.

Why this is relevant

Calculations of this sort may serve to boost our confidence in the strength of elliptic curve cryptography (ECC). To better understand how hard it would be to solve the supposedly hard mathematical problems underlying ECC, we try to solve ECDLP for parameters that are relatively small compared to those that would be used in ECC. The resulting runtimes combined with common extrapolation methods then yield hardness estimates based on which “secure” parameters can be selected or the security of employed ECC systems can be assessed.

A soon to be abolished ECC standard uses 160-bit prime fields. Solving ECDLP over such fields is generally believed to require an effort that is at least 16 million times as large as for 112-bit prime fields. The runtime that we observed for the 112-bit case implies that, even though the 160-bit ECC standard is supposed to be phased out by the end of the year 2010, for the next decade no regular user needs to be overly concerned about the security of 160-bit ECC. More on this issue, and also on the security of 1024-bit RSA moduli, can be found here.

Cryptanalytic potential of a PS3 cluster

In previous work, it was shown that the same PS3 cluster can be used for entirely different cryptanalytic tasks as well, namely collision finding for cryptographic hash functions. This has already extensively been reported on the web and in the cryptographic literature. For further information consult http://www.win.tue.nl/hashclash/ and the links given there. Other work involving symmetric cryptanalysis (such as symmetric key searches) is in preparation.

Our current ECDLP application is quite different and shows that the PS3 is good at asymmetric cryptographic tasks as well. These typically involve modular arithmetic which can, for instance, be applied to ECDLP (as here) and the Elliptic Curve Method (ECM) for factoring integers.

Acknowledgements

We gratefully acknowledge support by the Swiss National Science Foundation under grant numbers 200021-119776 and 206021-117409 and by EPFL DIT.

Authors

Joppe W. Bos1 and Marcelo E. Kaihara1
with Thorsten Kleinjung1, Arjen K. Lenstra1,2 and Peter L. Montgomery3

(1) EPFL IC LACAL, CH-1015 Lausanne, Switzerland
(2) Alcatel-Lucent Bell Laboratories, Murray Hill, NJ, USA
(3) Microsoft Research, One Microsoft Way, Redmond, WA 98052, USA