The RSA Cryptosystem
by Gonzalo Garcia Estebarena


"Suppose that the cleaning lady gives p and q by mistake to the garbage collector, but that the product pq is saved. How to recover p and q? It must be felt as a defeat for mathematicians that the most promising approaches are searching the garbage dump and applying memo-hypnotic techniques."
Hendrik W. Lenstra Jr.


Brief Introduction to Cryptography
The Development of the Public - Key Cryptosystem
The Rivest - Shamir - Adleman (RSA) System
An Implementation of the System
Appendix A - The Visual Basic Implementation



When Julius Caesar sent messages to his trusted generals, he did not trust his messengers. This led him to develop one of the earliest cryptograms that are known to us. When Caesar wrote the messages to his generals he applied a "shift by 3" cipher. Therefore he replaced every A for a D, every B for an E, every C for an F, an so on through the alphabet. Only the generals who knew this "shift by 3" cipher could understand the message, while all eavesdroppers were kept in ignorance.

Centuries later, during the times of the Cold War, there was a new expanded interest in cryptology. Many great mathematicians in the United States dedicated their efforts to cryptography and cryptanalysis. While cryptography refers to the study of procedures for enciphering plaintext into ciphertext, and then deciphering back into the original plaintext, "cryptanalysis is the science and study of methods of breaking ciphers". Cryptography and cryptanalysis are both studied by cryptology. Nowadays, international spies have already became heroes of the past. With the end of the cold war, the dissolution of the USSR, and the end of the Gulf War, there seems to be no need for keeping up the effort devoted into the study of cryptology.

There might be no need to develop new cryptograms for military communications, yet the world has undergone a great digitalisation era since the Cold War days. Personal computers have spread all over the world and every company in the globe has vital commercial and industrial information stored in a digital form. Before this change, being an industrial spy would have implied having to break into a company and then breaking into a safe located in the basement of the companyís headquarters. Since the widespread of computers, it is now enough to connect oneís personal computer to the companyís server and retrieve from it as much classified information as is required.

At first the methods for protecting the digital information were in some way similar to those used for protecting information written on paper. These consisted of barriers which stopped the intruder or enemy from accessing the information, yet, once these were trespassed, the information could be read and understood directly, like if it werenít secret. In the old days the barriers were the locks and the heavy thick doors, later they were represented by passwords that were needed to access classified information. Yet, if by some way the password checking was avoided, the information was back there ready to be retrieved.

Later, these methods were improved, and cryptographers discovered that it was more important to cipher the information than to restrict the access to it. So less effort was allocated into the development of new barriers and cryptograms were developed. In this way, an intruder could access the information but would not be able to understand it.

With the introduction of networks beyond the physical limits of a company, this proved to have been a good approach to solve the security problems. Except for wired networks within the limits of a companyís building, virtually all other networks can be passively wiretapped. WANs (Wide Area Networks), the Internet and other similar networks are particularly vulnerable to eavesdropping. Most multinational companies use these networks for transmitting secret financial data. Private users also use some of these highways, specially Internet, for commercial purposes, transmitting credit card numbers, bank accounts information and passwords. Therefore it is essential to ensure the users of this systems a relatively high degree of privacy, virtually eliminating all possibilities of successful eavesdropping.

In 1973, a team led by Walter Tuchman developed the Data Encryption Standard, which was based upon a concept originated by Horst Feistel of IBM's Yorktown Research Laboratory. Later, in 1978 Rivest, Shamir and Adleman developed a new scheme, the RSA Cryptosystem, which is still considered to be the strongest, most secure cryptogram.

The purpose of this paper is to analyse the mathematical background of the RSA Cryptosystem and therefore determine whether this system solves the security problem and ensures all users that their privacy will remain safe no matter how much effort is dedicated to breaking their code.


Brief Introduction to Cryptography

Cryptography is defined as "the science and study of secret writing". A cryptosystem or scheme is a method by which plaintext is converted into ciphertext, and then back into plaintext. "The process of transforming plaintext into ciphertext is called encryption; the reverse process is called decryption." Both encryption and decryption are dependant upon a key.

The cryptosystem applied by Julius Caesar would be a substitution cipher in which each letter in the alphabet is substituted for another one. In the message "SEND MORE TROOPS", "SEND MORE TROOPS" would be the plaintext, "VHQG PRUH WURRSV" the cipher text and 3 the key, which indicates by how many letter is the alphabet shifted.

A public-key cryptosystem is one which has two keys: a public and a private one. The private one provides the sole way to compute the inverse process to that computed with the public key. Therefore anyone can encrypt a message using the public key of a certain user, but only this user will be able to decrypt it by using his secret private key.

In cryptography the effectiveness of a system depends on how long it takes to break a cipher with a key n bits long. Time is considered to be a function of n (the length of the key). If this function is polynomial then the problem of breaking the code is considered to be solvable in polynomial time, else if the function is exponential, the solution is referred to as not being obtainable in polynomial time. If the size of the key is enlarged in a polynomial - time problem, then the time needed to break the cipher is not enlarged as much as if it were enlarged in an exponential - time problem. The difference is such that by enlarging the key in an exponential - time problem the same computer technology can be used to calculate the transformations, but a new generation of computer hardware would be needed to break the cryptosystem.

The Development of the Public-Key Cryptosystem

Before the development of cryptosystems which applied the concept of trapdoor functions, the major concern was the key distribution. All classical schemes required a secure channel (usually a trusted courier) through which the key was transferred before the scheme could be used to transfer through the noisy channel (where a possibility of eavesdropping exists). The secure channel is often slow and inefficient, else it would become the channel for frequent communication and there wouldnít be any reason for using the noisy channel as the medium for communications. The speed at which present electronic communications take place and the frequency with which keys are changed make the use of a secure channel a choice which largely reduces the effectiveness of the system.

"Diffie and Hellman addressed this fundamental problem in their seminal paper in 1976." They suggested the idea of a public-key system that avoids the need for a secure channel for the exchange of keys. This scheme relies on what Diffie and Hellman call a "trapdoor function". This is a type of function for which the value of the function can be easily calculated, but its inverse function is computationally infeasible in polynomial time (as the size of the input increases the solution can not be found even on the fastest computers), yet there exists a secret (or private) key which allows an easy calculation of the inverse function.

The encryption (e) and decryption (d) algorithms are public, therefore they are known by all users in a given system. Each user randomly generates a pair of keys: a secret key (Ks) and a public key (Kp). The keys are chosen in such a way that the following identity stands for all random pair of keys:

e(P, Kp) = C,
d(C, Ks) = P,
d(e(P, Kp),Ks) = P,

where Kp is the public key and is made available to all users, Ks is the private key and is therefore kept secret. P is the plaintext, or original message, and C the ciphertext or encrypted message.

This identity implies that the encryption function e, with parameters P and Kp will return the ciphertext C; later, the decryption function d with parameters C and Ks, returns P, the original plaintext. Therefore it implies that e and d are inverse functions of each other.


The procedure followed for communicating with this system is the following:


Supposing that user A is willing to send user B a message:


  1. User A uses the encryption algorithm and Bís public key to create a ciphertext
  2. User B uses the decryption algorithm and his own private key to decipher the text and obtain the plaintext.

If the properties of the trapdoor function proposed by Diffie and Hellman stand, then it should be computationally infeasible to obtain the plaintext from the ciphertext without knowing the private key. At the same time, the transformations done by A and B should be easy enough to make this method practical and efficient. This system then ensures that only user B will be able to read the message sent by user A. Not even user A would be able to read it once it is encrypted.

Still there are three assumptions made in this system:


    1. It is easy to generate pairs of keys Ks and Kp.
    2. The number of possible pairs of keys is big enough to ensure that the chance of two users having the same pair of keys is extremely low.
    3. It is computationally infeasible to calculate the private key Ks from the public key Kp.

In spite of coming up with this idea of trapdoor functions which virtually solved the key distribution problem, Diffie and Hellman were unable to create an implementation of this cryptosystem. Pohling and Hellman were the first to publish an implementation of a trap-door function. Yet, their implementation solved the key distribution problem but could not provide any privacy, since they could never overcome the third assumption.


The Rivest - Shamir - Adleman (RSA) System

At about the same time, in 1978, Rivest, Shamir, and Adleman produced the first useful implementation of a public-key system. The RSA scheme is a cryptosystem based on an exponentiation-modulo procedure.


In exponentiation-modulo ciphers, the encryption and decryption functions are as follows:

The encryption function is given by e(P, Kp), where Kp consists of two numbers E and n. Therefore the encryption function is e(P, E, n). The expression for this encryption function is e(P, E, n) = PE mod n.

Similarly the decryption function is given by d(C, Kp), where Ks consists of two numbers D and n. Therefore the decryption function is d(C, D, n). The expression for this decryption function is d(C, D, n) = CD mod n.


Then the ciphertext is obtained by C = PE mod n

and the plaintext is restored by P = CD mod n


When combined we obtain (PE mod n)D mod n = P

which could be expressed as PED mod n = P


For this to be true E and D should be multiplicative inverses of each other, which means that they should satisfy the relation


ED mod f (n) = 1


This can be proved in the following way:


Eulerís generalisation of Fermatís Theorem states that for every M relatively prime to n,


Mf (n) mod n = 1


where f (n) is the Euler totient function which is the number of elements in the reduced set of residues modulo n. f (n) is the number of positive integers less than n that are coprime to n. For a prime number p, f (p) = p-1; for a composite number n = p x q (p and q being primes), f (n) = (p-1) (q-1).


Now, given that E and D satisfy the relation ED mod f (n) = 1 and M is a message in the interval [0, n-1], such that M is coprime with n, then


ED = tf (n) + 1, for some integer t.


Thus, MED mod n = Mtf (n) + 1 mod n

                            = M Mf (n) mod n

                            = M (Mtf (n) mod n) mod n


Where Mtf (n) mod n = (Mf (n) mod n)t mod n

                                  = 1t mod n

                                  = 1


Thus, MED mod n = Mtf (n) + 1 mod n

                            = M



Having explained this, it becomes clear that for calculating the private decryption key D it is enough to know the public encryption key E and the f (n). When Pohling and Hellman developed their implementation of the cryptosystem they used as n a prime number. But since E and f (n) are enough to obtain D, and f (n) = n-1, for a prime number, they couldnít provide privacy.

Yet, Rivest, Shamir and Adleman had managed to keep D secret after making E and n public by making n = p x q, where p and q are two very large primes. Then , which implies knowing p and q. In this way n can be made public because the only way to obtain p and q is to factor n, which is known to be computationally infeasible in polynomial time.



This system developed by Rivest Shamir and Adleman was observed to have the property of ensuring authenticity. "By symmetry, enciphering and deciphering are commutative and mutual inverses; thus (MD mod n)E mod n = MDE mod n = M. It is because of this symmetry that the RSA scheme can be used for secrecy and authenticity in a public-key system." It was just a matter of applying the transformations in the correct order an using the correct keys. If user A wishes to send user B a signed encrypted message, the procedure works as follows:


  1. User A encrypts the message using Bís public key.
  2. User A encrypts the encrypted message again using itís own private key.
  3. The double encrypted message is sent over a noisy channel

Once this has been done, the only way to obtain the original message back is to decrypt the message twice. First using Aís public key and then using Bís private key. The fact that using Aís public key will finally result in a coherent message implies that Aís private key was used. Assuming that A is the only person with access to this private key, then the authenticity of the sender is confirmed. The fact that Bís private key is needed to decrypt the message ensures that B is the sole recipient of the message.

Therefore the procedure is completed as follows:


  1. User B decrypts the message using Aís public key.
  2. User B decrypts the message using its own private key.
  3. A coherent plaintext is obtained.

An Implementation of the System

Having understood the maths behind the system, then, for this system to represent a possible solution to the security problem, then an implementation of it should be possible to develop. Yet, before starting with the algorithm in itself it is necessary to clarify the procedure and state the series of steps that the program should follow in order to decrypt and encrypt according to the maths explained above.

The system should be able to perform three clearly distinct operations


  1. Generate a pair of keys for a new user.
  2. Decrypt incoming messages using the userís private key.
  3. Encrypt outgoing messages using other userís public key.

Any implementation of the algorithm should be able to handle the extended ASCII (American Standard Code for Information Interchange) character set. This means that each character in the plaintext is represented by 8 bits (1 byte), which means that there are 256 (28) characters, ranging from 0 to 255. As the primes p and q, f (n), n, E and D are all expressed in a decimal base, then in order to compute the exponentiation functions P and C should also be expressed in a decimal base. A string of ASCII characters could be interpreted as a number expressed in base 256.

For example "ME" could be interpreted as representing the decimal number 19781. This is because M is the representation of the ASCII code 77 and E of the code 69. There ME = 256 x 77 + 69 = 19781.

This implies that a careful choice of n (therefore of p and q) should be made. From

, and , we know that determines the highest possible value for C and P to be n-1. Therefore, if for encrypting we divide the plaintext into blocks of one character each, then n should be such that n ³ 256. Similarly, if we are using two character blocks, n should be such that n ³ 2562. As a general rule we should say that n should be such that n ³ 256b, where b indicates the length of the block.

Choosing n ³ 256b has a new implication. It means that, according to , all plaintext P in the interval [0, n-1] can be encrypted. The problem is that it also implies that, according to , any ciphertext in the interval [0, n-1] should be decrypted. It is clear that there can not exist any plaintext in the interval because a string of plaintext with a length of b characters can only represent a number in the interval . Yet, being the encryption function the inverse function of the decryption one and a rearrangement of the elements in the interval [0, n-1], there can exist some C in the interval [256b, n -1] which is the result of encrypting a P in the interval [0, 256b -1]. Therefore if C takes a value in the interval [256b, n -1], then it is not possible to represent it as a string of b ASCII characters.

In order to solve this problem, b should be chosen as a function of n, and not n as a function of b. Then b should be such that b = [log256 n] ([a] represents truncated integer part, e.g. [5.79] = 5). This should be the block length used for encrypting. Then the block length used for decrypting should be b + 1 so as to cover the interval [256b, n -1].



The procedure for generating a pair of keys for a new user will be the following:


  1. Two large primes p and q are generated
  2. n is calculated as the product of the two primes p and q
  3. f (n) is calculated as the product of (p-1) and (q-1)
  4. Another prime E is generated in the range [max (p, q) + 1, n -1]
  5. D is computed such that ED mod f (n) = 1
  6. The key (E, n) is made public, while the key (D, n) is kept private. f (n) is discarded for further security.

The procedure for decrypting an incoming ciphertext C will be the following:


  1. The private key (D, n) of the user is loaded.
  2. Break down C into blocks of length b + 1, where b = [log256 n]
  3. Change the base of the blocks from base 256 to base 10.
  4. Decrypt each block by computing P = CD mod n.
  5. Express each decrypted block as a string of b ASCII characters.

The procedure for encrypting an outgoing message will be the following:


  1. Load the public key of the recipient (E, n).
  2. Break down the message into blocks of b ASCII characters, where b = [log256 n]
  3. Change the base of the blocks from base 256 to base 10.
  4. Encrypt the blocks by performing C = PE mod n.
  5. Express the blocks as a string of b + 1 ASCII characters each, where b =[log256 n].

Testing the Implementation of the System

In order to test the system I decided to develop an implementation myself. It was developed in Microsoft Visual Basic for Windows 95. For computing the multiplicative inverse (D) of E, an extension of the Euclidean algorithm is used (Appendix A).


The key generation procedure


A pair of keys will be generated for user A, an another pair for user B. In this way a communication between user A and B can be simulated.


Key generation for user A


  1. For demonstration purposes, p and q will be chosen to be small primes (in order to be considered big enough so as to provide absolute security they should be 1024 bits long each). Let p = 257 and q = 263
  2. Let n = 257 x 263 = 67591
  3. Let f (n) = (257-1) (263-1) = 67072
  4. Let E = 1033
  5. Let D = 48697
  6. The public key (1033, 67591) is published, while the private key (48697, 67591) is stored secretly by user A.

Encryption Procedure

User B encrypts a message for user A.


The message "Hello!" will be sent.


  1. The public key of user A is known to be (1033, 67591)
  2. The message should be broken down into blocks of b ASCII characters each, where b=[log256 n] and n=67591. Then b = 2. Therefore the plaintext "Hello!" is broken into three blocks: "He"; "ll" and "o!".
  3. The blocks are expressed in base 10.
  4. H = 72; e = 101 from where He = 256 x 72 +101 = 18533.

    Similarly ll = 27756 and o!=28449

  5. The blocks are encrypted by performing C = P1033 mod 67591.
  6. First block: C = 185331033 mod 67591 = 6068

    Second block: C = 277561033 mod 67591 = 257

    Third block: C = 284491033 mod 67591 = 36073

  7. The blocks are expressed with b + 1 = 3 ASCII characters each:
6068 = 0, 23, 180. The first block should be represented with the ASCII characters 0, 23 and 180.

257 = 0, 1, 1. The second block should be represented with the characters 0, 1 and 1.

36073 = 0, 140, 233. The last block should be expressed with the characters 0, 140 and 233.

Most of this numbers have a weird representation which is dependant on the font being used while working in the Windows environment. Yet they can be transmitted through any network and decrypted by the recipient.



Decryption Procedure


User A decrypts the message sent by user B


The received message is the representation of the sequence of ASCII codes:

0, 23, 180, 0, 1, 1, 0, 140, 233.


  1. The private key of user A (48697, 67591) is used for decryption.
  2. The receive message should be broken down into blocks of length b + 1 where:
  3. b = [log256 n] and n = 67591. Then b = 2. Therefore the message is broken into the following three blocks: 0, 23, 180; 0, 1, 1; 0, 140, 233.

  4. The base of the blocks should be changed to a decimal base:
  5. Block 1: 0, 23, 180 = 0 x 2562 + 23 x 256 + 180 = 6068.

    Block 2: 0, 1, 1 = 0 x 2562 + 1 x 256 + 1 = 257.

    Block 3: 0, 140, 233 = 0 x 2562 + 140 x 256 + 233 = 36073.

  6. The blocks are decrypted by P = C48697 mod 67591
  7. Block 1: P = 606848697 mod 67591 = 18533

    Block 2: P = 25748697 mod 67591 = 27756

    Block 3: P = 3607348697 mod 67591 = 28449

  8. Finally the blocks are expressed as a string of b = 2 ASCII characters:
Block 1 = 18533 = 72, 101 = He

Block 2 = 27756 = 76, 76 = ll

Block 3 = 28449 = 111, 33 = o!

Putting the words together the original plaintext "Hello!" is obtained.



The investigation on the mathematical background of the RSA cryptosystem has shown that this system is able to provide both privacy and authenticity functions. It has also shown that the sole known method to crack the system is to compute the private decryption key D from the public encryption key E and the modulo n. In order to do this f (n) must be known, which in the case of n being the product of two prime numbers is given by f (n) = (p-1) (q-1). This implies known the factors p and q of the product n. Since factoring is considered until now a computationally infeasible problem, the keys used can be enlarged easily, making the cracking problem a much harder one to solve. "If p and q are each 1024 bits long, the sun will burn before the most powerful computers presently in existence can factor your modulus into p and q."

This paper has demonstrated that the system can be easily implemented in a home-made application. Even though this application is not able to handle numbers as long as 1024 bits, the privacy that it ensures is enough for any type of amateur communication across a network such as the Internet. More complex commercial packages can be easily obtained by the public to ensure privacy and authenticity in the transfer of industrial secrets, bank account numbers and data of similar importance.

Even though this system is considered to be the most reliable one for secrecy and authenticity, it is important to take into account that it is based on three major assumptions:


    1. The only computationally feasible method to obtain the original plaintext from the ciphertext is to know the private key.
    2. The only way to obtain the private key is to know f (n) by factoring n.
    3. The factoring problem is computationally infeasible.

None of the above assumptions has been proved, which represents a possibility of cracking the system in polynomial time, yet an analyses of them escapes the scope of this investigation.

A field of new development is that in which these packages are being studied by lawyers, judges and legislators so as to decide whether a digital document which has being signed with the authenticity procedure has legal value or not, taking into account that it is harder to create a false digital document than a false paper copy of a contract or something of the sort. Yet, this field exceeds the limits and goals of this paper and would deserve another essay to analyse the for and against of this system.


Appendix A - The Visual Basic Implementation



Dim Shared bl


Public Function Inv(A, N)


Dim G(-1 To 100)

Dim U(-1 To 100)

Dim V(-1 To 100)


G(0) = Val(N)

G(1) = Val(A)

U(0) = 1

V(0) = 0

U(1) = 0

V(1) = 1

I = 1


Do While G(I) <> 0

y = Int(G(I - 1) / G(I))

G(I + 1) = G(I - 1) - y * G(I)

U(I + 1) = U(I - 1) - y * U(I)

V(I + 1) = V(I - 1) - y * V(I)

I = I + 1



x = V(I - 1)

If x >= 0 Then Inv = x Else Inv = x + Val(N)

End Function


Public Sub Update()


bl = 2


Original = RSA.PT


For x = 0 To 16 - Len(RSA.PT) - 1

Original = Original + Chr$(0)

Next x


RSA.N = Val(RSA.P) * Val(RSA.Q)

RSA.FN = (Val(RSA.P) - 1) * (Val(RSA.Q) - 1)


Encrypt (Original)

Decrypt (RSA.CT)

End Sub


Public Sub Encrypt(Text)


ReDim Bloque(0 To 16 / bl)


For Block = 1 To 16 / bl

Bloque(Block) = Encode(Mid$(Text, (Block - 1) * bl + 1, bl))

Next Block


RSA.CT = ""


For x = 0 To 16 / bl - 1

RSA.PTB(x) = Bloque(x + 1)

RSA.CTB(x) = FastExp(RSA.PTB(x), RSA.E, RSA.N)

Next x

End Sub


Public Sub Decrypt(Text)


RSA.DT = ""

For x = 0 To 16 / bl - 1


RSA.DTB(x) = FastExp(RSA.CTV(x), RSA.D, RSA.N)

RSA.DT = RSA.DT + Decode$(RSA.DTB(x))

Next x

End Sub


Public Function FastExp(A, Z, N)


N = Val(N)

A = Val(A)

Z = Val(Z)

x = 1


Do While Z <> 0

Do While Z - Int(Z / 2) * 2 = 0

Z = Int(Z / 2)

A = A ^ 2 - Int(A ^ 2 / N) * N



Z = Z - 1

x = x * A - Int(x * A / N) * N




FastExp = x

End Function


Public Function Encode(Text)


Encode = 0


For x = 1 To Len(Text)

Encode = Encode + 256 ^ (x - 1) * Asc(Mid$(Text, Len(Text) - x + 1, 1))

Next x

End Function


Public Function Decode$(Number)


Number = Val(Number)

Decode = ""



A = Number - Int(Number / 256) * 256

If Number > 0 Then Decode = Chr$(A) + Decode Else Exit Do


Number = (Number - A) / 256


End Function



Private Sub E Change ()


End Sub


Private Sub P Change ()


End Sub


Private Sub PT Change ()


End Sub


Private Sub Q Change ()


End Sub





One application of mathematics is cryptology, the discipline of making and braking codes. This paper analyses the mathematical background of the RSA Cryptosystem to determine whether it solves the security problem and ensures all users that their privacy will remain safe no matter the effort dedicated to breaking their code.

The RSA Cryptosystem is the first implementation of a cryptogram applying a trap-door function. It is based on an exponentiation-modulo function which is used for encryption. A modulo function is a one-way function, nevertheless, there is a way to calculate its inverse by using a different exponential, which makes it a trap-door function. A public exponential is used for encryption, and a private one is used for decryption. This system hides the private key, allowing the release of the public one.

In order to obtain the private key, f (n) (the size of the reduced set of modulo n, where n is the product) should be known. In the case of a number n, which is the product of two prime numbers p and q, f (n) = (p-1) (q-1), which implies knowing the prime numbers. The difficulty to break the system is based upon the assumption that factoring is a process that can not be completed in polynomial time, therefore it would be impossible to obtain p and q from n, if p and q are chosen to be large enough.

By means of a home made application of the cryptosystem this paper shows that the RSA is capable of ensuring both privacy and authenticity as long as p and q are large primes. The system is based on the assumptions that factoring n is the only way to obtain the private key and that factoring is infeasible in polynomial time, a proof of which escapes the scope of the essay.