From msuinfo!netnews.upenn.edu!dsinc!spool.mu.edu!howland.reston.ans.net!europa.eng.gtefsd.com!news.umbc.edu!olson Thu Sep 15 13:12:41 1994
Path: msuinfo!netnews.upenn.edu!dsinc!spool.mu.edu!howland.reston.ans.net!europa.eng.gtefsd.com!news.umbc.edu!olson
From: olson@umbc.edu (Bryan G. Olson; CMSC (G))
Newsgroups: sci.crypt
Subject: Re: ** Help me Montgomery ? **
Date: 14 Sep 1994 18:01:46 GMT
Organization: University of Maryland, Baltimore County
Lines: 135
Message-ID: <357dqa$iq@news.umbc.edu>
References: <35700h$nbe@lucy.ee.und.ac.za>
NNTP-Posting-Host: umbc7.umbc.edu
X-Newsreader: TIN [version 1.2 PL2]
Jyri Hamalainen (cat@olddaisy.ee.und.ac.za) wrote:
: Could someone explain the basic logic of
: "Montgomery modular multiplication without trial division",
: method in hardware/ software applications.
Peter Montgomery himself sometimes posts to this group, but
I'll give an introduction to Montgomery multiplication which
may be a bit easier for computer types to follow than his
paper:
Montgomery, Peter L, "Modular Multiplication Without
Trial Division", _Mathematics of Computation_, Vol 44
#170, April 1995, pp 519-521.
To use Montgomery multiplication mod N, we define an alternate
representation of the integers (0,1,2,...N-1) We translate normal
integers to Montgomery representation, do our multiplications with
this new representation, then translate back to the normal
representation.
When computing an exponent mod N, we don't translate back to normal
representation between multiplications; we just leave the results in
Montgomery representation until all the multiplications are done.
To avoid confusion between "mod" as an operation and as notation of
equivalence classes, I'll use "x=y mod n" to mean that x is the least
residue of y mod n, and "x==y mod n" to mean that x is congruent mod n
to y.
To form the Montgomery representation, we choose some R which is
greater than N and relatively prime to N, and we want division by R to
be inexpensive. In crypto work our modulus N usually has no small
factors, so we can choose R to be some power of 2 a little bigger than
N. Division by any power of two is easy, since we just chop off low
order bits.
Now to represent the integers in 0..N-1, we use the Montgomery
representation
M(x) = xR mod N
To convert from Montgomery representation to normal representation, we
find the inverse of R mod N under multiplication mod N. I'll call it
R'. (Actually there are infinitely many inverses; I want the least
residue). Then the inverse of M(x) is
M'(x) = xR' mod N
Since R and N are relatively prime, these functions are one to one on
{ 0,1,2,...N-1 }. All these integers have a unique representation.
Now we find N' such that
0 < N'
Organization: Compulink Information eXchange
References: <35700h$nbe@lucy.ee.und.ac.za>
Date: Thu, 15 Sep 1994 11:03:47 GMT
Lines: 109
In article <35700h$nbe@lucy.ee.und.ac.za> cat@olddaisy.ee.und.ac.za (Jyri Hamalainen) writes:
>
>Could someone explain the basic logic of
>
>"Montgomery modular multiplication without trial division",
>
>method in hardware/ software applications.
>
>Thank you
>
>jYrI
>
Let R be a power of the machine word size. It needs to be greater
than the modulus N, and coprime to it.
Let
R' = R^-1 mod N
N' = -(N^-1) mod R
To compute TR' mod N from T of 0 <= T <= RN:
function REDC(T)
begin
m := (T mod R) * N' mod R;
t := (T + m * N)/R;
if t >= N then return t-N else return t
end;
because
m * N == T * N' * N mod R
(N' * N == -1 mod R)
m * N == T * -1 mod R
== -T mod R
So when we add m * N to T, the result is congruent to 0 mod R.
Therefore the result of the division, t, is an integer. In fact, the
last line of REDC does not need to be executed if we maintain two
extra bits throughout the calculations.
Given two numbers x and y between 0 and N-1 inclusive, let z =
REDC(xy). Then z == (xy)R^-1 mod N, so (x * R^-1)(y * R^-1) == z *
R^-1 mod N. (This implies that there is no need to correct for the
multiplications by R^-1 during the exponentiation.) Also, addition
and subtraction is unchanged in this form.
Also, note that the product T * N' needs only to be carried out at
half precision.
In order to use REDC, we need to convert our numbers into N-residue
form before beginning the calcualtion. To convert an integer x into
N-residue form, compute x*R mod N. To convert an N-residue into an
integer, multiply it by R^-1 mod N.
Multiprecision Case
Instead of computing all of m at once, we can do it a digit at a time.
This allows us to use a single digit n0', which is -N^-1 mod b instead
of -N^-1 mod R. b is the word size of the computer.
The result will not be the same as the original algorithm, but T will
still be a multiple of R.
function REDC(T)
begin
for i := 0 to n-1 do begin
m := T[i] * n0' mod b;
T := T + m * N * b^i
end
return T/R;
end
Note that this is very similar to long multiplication:
begin
for i := 0 to n-1 do begin
T := T + A[i] * B * b^i
end
end
This takes n^2 + n multiplications. Effectively, the modular
multiplication has been reduced to two long multiplications. However,
still further performance improvements can be realized by interleaving
the two phases of multiplication and reduction:
function Mont_Product(A,B)
begin
T := 0;
for i := to to n-1 do begin
T := T + A[i] * B * b^i;
m := T[i] * n0' mod b;
T := T + m * N * b^i;
end;
return T/R;
end
This performance improvement is very valuable if using a DSP, because
multiplies are very fast, and bookkeeping overhead is to be avoided
wherever possible. It still does as many multiplies as the separate
multiplication and reduction, however, so the performance gain will be
small on a more conventional microprocessor.
Andrew.
From msuinfo!agate!howland.reston.ans.net!EU.net!ub4b!idefix.CS.kuleuven.ac.be!news.sri.ucl.ac.be!veithen Thu Sep 15 13:13:21 1994
Path: msuinfo!agate!howland.reston.ans.net!EU.net!ub4b!idefix.CS.kuleuven.ac.be!news.sri.ucl.ac.be!veithen
From: veithen@dice.ucl.ac.be (Veithen Daniel)
Newsgroups: sci.crypt
Subject: Re: ** Help me Montgomery ? **
Date: 15 Sep 1994 10:05:31 GMT
Organization: Universite Catholique de Louvain, Louvain-la-Neuve, Belgium
Lines: 86
Distribution: world
Message-ID: <35969b$csj@sci3.sri.ucl.ac.be>
NNTP-Posting-Host: sphinx.dice.ucl.ac.be
cat@olddaisy.ee.und.ac.za (Jyri Hamalainen) writes:
>
> Could someone explain the basic logic of
>
> "Montgomery modular multiplication without trial division",
>
> method in hardware/ software applications.
>
> Thank you
>
> jYrI
>
Try the following articles about Montgomery's algorithm :
@inproceedings{21,
author = "Duss\'{e}, S.R. and {Kaliski Jr.}, B.S.",
address = "New York",
booktitle = "- Proc. Advances in Cryptology - EUROCRYPT '90",
editor = "Damg{\aa}rd, I.",
pages = "230--244",
publisher = "Springer Verlag",
series = "Lecture Notes in Computer Sciences",
title = "{A cryptographic library for the Motorola DSP56000}",
volume = "473",
year = "1991",
ISBN = ""
}
@inproceedings{n29,
author = "Eldridge, S.E. and Walter, C.D.",
booktitle = "IEEE transactions on computers",
month = jun,
number = "6",
organization = "IEEE",
pages = "693--699",
title = "Hardware implementation of {Montgomery's} modular multiplication algorithm",
volume = "42",
year = "1993",
ISBN = ""
}
@inproceedings{n14,
author = "Walter, C.D.",
booktitle = "IEEE transactions on computers",
month = mar,
number = "3",
organization = "IEEE",
pages = "376--378",
title = "Systolic modular multiplication",
volume = "42",
year = "1993",
ISBN = ""
}
@inproceedings{10,
author = "Arazi, B.",
booktitle = "IEEE Journal on selected areas in communications",
month = jun,
number = "6",
organization = "IEEE",
pages = "761--769",
title = "Double-precision modular multiplication based on a
single-precision modular multiplier and a standard CPU"
volume = "11",
year = "1993",
ISBN = ""
}
Daniel Veithen
Laboratoire de Microelectronique de l'Universite Catholique de Louvain
Batiment Maxwell tf: (+32) 10 47 81 37
Place du Levant, 3 fax: (+32) 10 47 86 67
B-1348 Louvain-La-Neuve e-mail: veithen@dice.ucl.ac.be
Belgium