Perhaps the best introduction to Linear Cryptanalysis is the original article by Matsui, although continued development can be expected to have changed this original approach somewhat.
Apparently, Linear Cryptanalysis starts by finding approximate linear expressions for S-boxes, then extends these expressions to the entire cipher. Clearly, if the expressions were precisely linear, known-plaintext could immediately be "solved" for key bits.
As I understand it, since the expressions are only approximate, in each expression a particular value for a key bit may only be slightly more probable than its complement. Accordingly, considerable known-plaintext is required before key bit values are clearly indicated.
The question for new cipher designs is whether we can ever prove that no approximate linear expression exists which is sufficiently effective as to expose the key. One answer to this is to key the S-boxes, thus depriving the analyst of precise knowledge of their contents which means that they cannot be reasonably approximated.
Matsui, M. 1993. Linear Cryptanalysis Method for DES Cipher. Advances in Cryptology -- EUROCRYPT '93. 386-397.
"We introduce a new method for cryptanalysis of DES cipher, which is essentially a known-plaintext attack. As a result, it is possible to break 8-round DES cipher with 221 known-plaintexts and 16-round DES cipher with 247 known-plaintexts."
"In this paper we introduce an essentially known-plaintext attack of DES cipher. The purpose of this method is to obtain a linear approximate expression of a given cipher algorithm. For this purpose, we begin by constructing a statistical linear path between input and output bits of each S-box. Then we extend this path to the entire algorithm, and finally reach a linear approximate expression without any intermediate value."
"Generally speaking, there exist many linear approximate expressions for a given cipher algorithm. Moreover, if plaintexts are not random, we may even find an expression which has no plaintext bits in it. This suggests that our method finally leads to an only-ciphertext attack."
Matsui, M. and A. Yamagishi. 1994. A New Cryptanalytic Method for FEAL Cipher. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Science. E77-A(1): 2-7.
"In this paper, we propose a new technique of a known plaintext attack of FEAL cipher. Our method is a kind of meet-in-the-middle attack with a partial exhaustive key search, and therefore we can derive all possible key candidates directly and deterministicly. In other words, it enables us to specify all candidates for secret key K which satisfies Encryption(P.K) = (C) for any of given plaintexts P and the corresponding ciphertext C.
"To realize this, we newly introduce a 'checking function' and a 'cutting off method.' The former is a function g(P,C,K) whose value is constant if and only if the key candidate K satisfies Encryption(P.K) = (C) for any plaintext P and the corresponding ciphertext C, and the latter is a technique to reduce the number of key candidates K."
Matsui, M. 1994. The First Experimental Cryptanalysis of the Data Encryption Standard. Advances in Cryptology -- CRYPTO '94. 1-11.
"In the first paper on linear cryptanalysis , we introduced a new measure of linearity of S-boxes and extended it to the entire cipher structure of DES."
"This paper studies an improved version of linear cryptanalysis and its application to the first successful computer experiment in breaking the full 16-round DES. We newly introduce two viewpoints; linear approximate equations based on the best (n-2)-round expression, and reliability of the key candidates derived from these equations. The former reduces the number of required plaintexts, whereas the latter increases the success rate of our attack.
In the 247-method, we established two linear approximate equations of 16-round DES using the best 15-round expression, where each equation includes one active S-box and hence recovers 7 secret key bits. This paper, however, begins with two new linear approximate equations derived from the best 14-round expression, where each equation has two active S-boxes and can recover 13 secret key bits. These equations give us, therefore, a total of 26 secret key bits, and then the remaining 56 - 26 = 30 secret key bits are within the reach of an exhaustive search."
"As a result, DES is breakable with complexity 243 and success rate 85% if 243 known-plaintexts are available. For another example, success rate is 10% with complexity 250 if 238 known-plaintexts are available.
"We carried out the first experimental attack of the full 16-round DES using twelve computers (HP9735/PA-RISK 99MHz) to confirm this scenario. The program, described in C and assembly languages consisting of a total of 1000 lines, was designed to solve two equations while generating 243 random plaintexts and enciphering them. We finally reached all of the 56 secret key bits in fifty days, out of which forty days were spent for generating plaintexts and their ciphertexts and only ten days were spent for the actual key search."
Daemen, J., R. Govaerts and J. Vandewalle. 1995. Correlation Matrices. Fast Software Encryption. Lecture Notes in Computer Science (LNCS) 1008. Springer-Verlag. 275-285.
"In this paper we introduce the correlation matrix of a Boolean mapping, a useful concept in demonstrating and proving properties of Boolean functions and mappings. It is argued that correlation matrices are the "natural" representation for the proper understanding and description of the mechanisms of linear cryptanalysis . It is also shown that the difference propagation probabilities and the table consisting of the squared elements of the correlation matrix are linked by a scaled Walsh-Hadamard transform."
"Most components in encryption schemes are Boolean mappings. In this paper, we establish a relation between Boolean mappings and specific linear mappings over real vector spaces. The matrices consist of the correlation coefficients associated with linear combinations of input bits and linear combinations of output bits."
Kaliski, B. and M. Robshaw. 1994. Linear Cryptanalysis Using Multiple Approximations. Advances in Cryptology -- CRYPTO '94. 26-39.
"Matsui and Yamagishi  introduced the idea of linear cryptanalysis in 1992 on an attack on FEAL . The techniques used in this attack were refined by Matsui and used with dramatic effect on DES  in a theoretical attack on the full 16-round DES requiring 247 known plaintext / ciphertext pairs . After further work an experiment was performed during which the key used in the full 16-round version of DES was recovered using 243 known plaintext / ciphertext pairs .
"The most notable feature about linear cryptanalysis is that it is a known plaintext attack rather than a chosen plaintext attack (differential cryptanalysis  is a chosen plaintext attack) and as such poses more of a practical threat to a block cipher. At present, however, a successful linear cryptanalytic attack on DES still requires a large quantity of known plaintext.
"In this paper we consider an extension to the linear cryptanalytic attack [4, 5] using multiple linear approximations. This offers a slight improvement in the efficiency of an attack on the DES but more importantly, it is generally applicable and in certain circumstances it might well be extremely effective in reducing the amount of data required by a cryptanalyst for a successful attack on a block cipher using linear cryptanalysis."
Youssef, A., S. Tavares, S. Mister and C. Adams. 1995. Linear Approximation of Injective S-boxes. IEE Electronics Letters. 31(25): 2168-2169.
"Nonlinearity is a crucial requirement for the substitution boxes in secure block ciphers. In this letter, we derive an estimate for the expected nonlinearity of a randomly selected injective substitution box."
"Diferential cryptanalysis  and linear cryptanalysis  are powerful cryptanalytic attacks on private-key block ciphers. The complexity of differential cryptanalysis depends on the size of the largest entry in the XOR table, the total number of zeros in the XOR table, and the number of nonzero entries in the first column of that table , . The complexity of linear cryptanalysis depends on the size of the largest entry in the linear approximation table (LAT).
"One way to reduce the size of the largest entry in the XOR table is to use injective substitution boxes (s-boxes) such that the number of output bits of the s-box is sufficiently larger than the number of input bits. In this way, it is very likely that the entries in the XOR distribution table of a randomly chosen injective s-box will have only small values, making the block cipher resistant to differential cryptanalysis. Some proposed block ciphers, such as CAST  and Blowfish , take advantage of this property.
"On the other hand, Biham  proved that if for an nxm s-box described by f: Z2n -> Z2m we have m >= 2n - n, then at least one linear combination of the output bits must be an affine combination of the input bits and the block cipher can be trivially broken by linear cryptanalysis. In this letter, we estimate the size of the largest entry in the LAT of a randomly selected injective s-box."
Youssef, A., S. Tavares. 1995. Resistance of Balanced S-boxes to Linear and Differential Cryptanalysis. Information Processing Letters. 56: 249-252.
"In this letter, we study the marginal density of the XOR distribution table, and the linear approximation table entries of regular substitution boxes (s-boxes). Based on this, we show that the fraction of good s-boxes (with regard to immunity against linear and differential cryptanalysis) increases dramatically with the number of input variables."
"Differential cryptanalysis , and linear cryptanalysis  are currently the most powerful cryptanalytic attacks on private-key block ciphers. The complexity of differential cryptanalysis depends on the size of the largest entry in the XOR table, the total number of zeros in the XOR table, and the number of nonzero entries in the first column in that table , . The complexity of linear cryptanalysis depends upon the size of the largest entry in the linear approximation table (LAT).
"One rerquirement in s-box design is to have a balanced s-box (also known as a regular s-box). This means that each output symbol should appear an equal number of times when the input is varied aver all possible values.
"Gordon and Retkin calculated the probability that one or more of the output coordinates of a random, reversible s-box is an affine function. By showing that this probability decreases dramatically with the number of input variables, they conjectured that larger s-boxes are better. In this letter, we provide further evidence for their conjecture by showing that the fraction of good s-boxes, with regard to immunity against linear and differential cryptanalysis, increases dramatically with the number of input variables."
Vaudenay, S. 1995. An Experiment on DES Statistical Cryptanalysis.
"Linear cryptanalysis and differential cryptanalysis are the most important methods of attack against block ciphers. Their efficiency have been demonstrated against several ciphers, including the Data Encryption Standard. We prove that both of them can be considered, improved and joined in a more general statistical framework. We also show that the very same results as those obtained in the case of DES can be found without any linear analysis and we slightly improve them into an attack with theoretical complexity 242.9.
"We can apply another statistical attack -- the X2-cryptanalysis -- on the same characteristics without a definite idea of what happens in the encryption process. It appears to be roughly as efficient as both differential and linear cryptanalysis."
"The success of those methods have focused the attention on the linear properties of the boxes. In this paper, we try to prove that the linear properties are not so important."
Harpes, C., G. Kramer and J. Massey. 1995. A Generalization of Linear Cryptanalysis and the Applicability of Matsui's Piling-up Lemma. Advances in Cryptology -- EUROCRYPT '95. 24-38.
"Matsui's linear cryptanalysis for iterated block ciphers is generalized by replacing his linear expressions with I/O sums. For a single round, an I/O sum is the XOR of a balanced binary-valued function of the round input and a balanced binary-valued function of the round output." "A cipher contrived to be secure against linear cryptanalysis but vulnerable to this generalization of linear cryptanalysis is given. Finally, it is argued that the ciphers IDEA and SAFER K-64 are secure against this generalization."
Buttyan, L. and I Vajda. 1995. Searching for the best linear approximation of DES-like cryptosystems. Electronics Letters. 31(11): 873-874.
"The authors show that the problem of searching for the best characteristic in linear cryptanalysis is equivalent to searching for the maximal weight path in a directed graph."
"We implemented the algorithm with type 1 restriction for a DES cryptosystem on a personal computer. We obtained the same best linear expressions as Matsui  in a few minutes. If better approximations were required, we should approximate s > 1 S-boxes in round function F. Unfortunately, this leads to rapid growth in the graph." "Even for s = 2, the direct implementation of the algorithm needs [about] 1012 bytes of memory, which makes it infeasible. This kind of problem means that there is a feasibility limit to such cryptanalysis in general. Suboptimal solutions could be expected by applying sequential decoding type algorithms for searching characteristics better than type 1 optimums."
Last updated: 1997-09-25