S-Box Design: A Literature Survey

Research Comments from Ciphers By Ritter

Terry Ritter

Many block ciphers are based on the old Shannon idea of the sequential application of confusion and diffusion. Typically, confusion is provided by some form of substitution ("S-boxes"). So the obvious question is whether some substitutions are better than others. The obvious answer is "Yes," because one possible substitution maps every value onto itself, just as though there were no substitution at all.

So the hunt was on for measures which would distinguish between "bad" and "good" substitutions, and for techniques to construct "good" substitutions. But since weakness measures are related to attacks, new attacks often imply a need for new measures. And since we cannot know what attack an Opponent may use, how can we select a substitution which will defeat that attack?

Accordingly, this reviewer has a bias for randomly-selected and keyed S-boxes. While these cannot be expected to have optimum strength against whatever is being measured, they can be expected to have average strength against even unknown attacks: Where there is no systematic design, there can be no systematic weakness. And when S-boxes are chosen at random, everyone can be sure that no S-box "trap door" is present. Keying the S-boxes inevitably takes time, but some authors count this as an advantage in slowing attacks.

Here the story starts with Feistel who first describes the concept of "avalanche."


1973 -- Feistel

Feistel, H. 1973. Cryptography and Computer Privacy. Scientific American. 228(5): 15-23.

"There is growing concern that computers now constitute, or will soon constitute, a dangerous threat to individual privacy. Since many computers contain personal data and are accessible from distant terminals, they are viewed as an unexcelled means of assembling large amounts of information about an individual or a group. It is asserted that it will soon be feasible to compile dossiers in depth on an entire citizenry, where until recently the material for such dossiers was scattered in many separate locations under widely diverse jurisdictions. It will be argued here, however, that a computer system can be adapted to guard its contents from everyone but authorized individuals by enciphering the material in forms highly resistant to cipher-breaking." [p.15]

". . . let us take a fresh look at the basis of all cryptography: substitution on blocks of message digits. We shall refer to any cipher that converts n message digits into n cipher digits as a block cipher." [p.20]

"If we had a box with 128 inputs and outputs, for example, an analyst would have to cope with 2128 (or more than 1038) possible digit blocks, a number so vast that frequency analysis would no longer be feasible. Unfortunately a substitution device with 128 inputs would also require 2128 internal terminals between the first and second switch, a technological impossibility. This is a fundamental dilemma in cryptography. We know what would be ideal, but cannot achieve the ideal in practice." [p.20]

"As the input moves through successive layers the pattern of 1's generated is amplified and results in an unpredictable avalanche. In the end the final output will have, on average, half 0's and half 1's . . . ." [p.22]

"The important fact is that all output digits have potentially become very involved functions of all input digits." [p.22]

". . . the resulting cryptogram exhibits a sensitive intersymbol dependence that makes all output digits complicated functions not only of all message digits but also of all digits in the key." [p.22]

"It would be surprising if cryptography, the traditional means of ensuring confidentiality in communication, could not provide privacy for a community of data-bank users." [p.23]

1979 -- Kam and Davida

Kam, J. and G. Davida. 1979. Structured Design of Substitution-Permutation Encryption Networks. IEEE Transactions on Computers. C-28(10): 747-753.

"To guard against the know[n]-plaintext cryptanalytic attack, we can see intuitively that the following property is desirable for SP networks:

"For every possible key value, every output bit ci of the SP network depends upon all input bits p1,...,pn and not just a proper subset of the input bits." [p.748]

"Definition: Give a one-one correspondence f:{0,1}n to {0,1}n, f is said to be complete if, for every i,j in {1,...,n}, there exist two n-bit vectors X1, X2 such that X1 and X2 differ only in the ith bit and f(X1) differs from f(X2) in at least the jth bit." [p.749]

1982 -- Gordon and Retkin

Gordon, J. and H. Retkin. 1982. Are Big S-Boxes Best?

"Shannon [2] uses the term confusion to denote the process of substituting one byte for another. The current jargon is to describe a device which substitutes one byte for another according to a fixed table as a substitution box or S-box." [p.257]

"Design techniques for 'good' S-boxes are somewhat sparse in the open literature, and here we focus attention on the statistical properties of random, reversible S-boxes, and begin to answer the question how good an S-box do you get if you choose the contents as a random permutation of the set of all possible outputs. Preliminary work seems to show that a variety of desirable properties are likely to be found in such a randomly chosen S-box if the number of entries is large enough." [p.257]

"It is clear from the formulae, and intuitively satisfying that the most probable linearity is that one or more outputs are linear, this probability being given by

     2m(2m - 1)(2m-1!)2 / 2m!

"The overbound when m = 4 (as in FIPS-46) is around 1%, while when m = 8 the bound has reduced to 10-72." [p.260]

"Now that we know an overbound to the number of distinct, reversible (m,m) S-boxes with various numbers of linear bits we may extend this result to what we shall call partly-linear relationships." [p.261]

"Expressed as a probability of occurrence, we obtain

     2m(2m - 1)(2m-1!)2 / f * 2m!

". . . this is an overbound on the number of distinct, reversible (m,m) S-boxes for which a fraction f OR MORE of the entries have one or more output bits which are linear functions of input bits." [p.262]

1982 -- Ayoub

Ayoub, F. 1982. Probabilistic completeness of substitution-permutation encryption networks. IEE Proceedings E. 129(5): 195-199.

"It has been suggested that trapdoors (see Section 3) can be set in substitution-permutation (SP)-type encryption algorithms or networks [1], and potential users of new encryption algorithms might require a proof for their freedom from a deliberate trapdoor. Recent research has shown that, under certain conditions, the substitution function can be designed by a random choice, thus providing the necessary proof [2,3].

"It is shown in this paper that when the permutation is also selected at random, i.e. user keyed, the resulting network retains, with a very high probability, the completeness property, i.e. every output bit is a function of all input bits." [p.195]

1985 -- Webster and Tavares

Webster, A. and S. Tavares. 1985. On the Design of S-Boxes. Advances in Cryptology -- CRYPTO '85 523-534.

"The ideas of completeness and the avalanche effect were first introduced by Kam and Davida [1] and Feistel [2], respectively.


If a cryptographic transformation is complete, then each ciphertext bit must depend on all of the output bits. Thus, if it were possible to find the simplest Boolean expression for each ciphertext bit in terms of the plaintext bits, each of those expressions would have to contain all of the plaintext bits if the function was complete. Alternatively, if there is at least one pair of n-bit plaintext vectors X and Xi that differ only in bit i, and f(X) and f(Xi) differ at least in bit j for all { (i,j) | 1 <= i,j <= n } then the function f must be complete. [p.523]


"For a given transformation to exhibit the avalanche effect, an average of one half of the output bits should change whenever a single input bit is complemented. In order to determine whether a given m x n (m input bits and n output bits) function f satisfies this requirement, the 2m plaintext vectors must be divided into 2m-1 pairs, X and Xi, such that X and Xi differ only in bit i. Then the 2m-1 exclusive-or sums

     Vi = f(X) XOR f(Xi)
must be calculated. These exclusive-or sums will be referred to as avalanche vectors, each of which contains n bits, or avalanche variables.

"If this procedure is repeated for all i such that 1 <= i <= m, and one half of the avalanche variables are equal to 1 for each i, then the function f has a good avalanche effect. Of course this method can be pursued only if m is fairly small; otherwise, the number of plaintext vectors becomes too large. If that is the case then the best that can be done is to take a random sample of plaintext vectors X, and for each value of i calculate all the avalanche vectors Vi. If approximately one half the resulting avalanche variables are equal to 1 for all values of i, then we can conclude that the function has a good avalanche effect." [p.524]

The Strict Avalanche Criterion and the Independence of Avalanche Variables

"The concepts of completeness and the avalanche effect can be combined to define a new property which we shall call the strict avalanche criterion. If a cryptographic function is to satisfy the strict avalanche criterion, then each output bit should change with a probability of one half whenever a single input bit is complemented. A more precise definition of the criterion is as follows. Consider X and Xi, two n-bit binary plaintext vectors, such that X and Xi differ only in bit i, 1 <= i <= n. Let

     Vi = f(Y) XOR f(Yi)
where Y = f(X), Yi = f(Xi) and f is the cryptographic transformation under consideration. If f is to meet the strict avalanche criterion, the probability that each bit in Vi is equal to 1 should be one half over the set of all possible plaintext vectors X and Xi. This should be true for all values of i." [p.524]

"In the process of building these S-boxes, it was discovered that if an S-box is complete, or even perfect, its inverse function may not be complete. This could become important if these inverse functions are used in the decryption process, for it would be desirable for any changes in the ciphertext to affect all bits in the plaintext in a random fashion, especially if there is not much redundancy in the original plaintext. Complete cryptographic transformations with inverses which are complete are described as being two-way complete, and if the inverse is not complete the transformation is said to be only one-way complete." [p.529]

1988 -- Pieprzyk and Finkelstein

Pieprzyk, J. and G. Finkelstein. 1988. Towards effective nonlinear cryptosystem design. IEE Proceedings, Part E. 135(6): 325-335.


"It is well known [2], that any cryptographic system should be described by a nonlinear function. If operation of the system were expressible by a linear function, then, for a fixed key, the encryption function would be described by a matrix. It means that a linear cryptosystem may be broken without knowing the key applied." [p.325]

"The article addresses the following question. What is the natural limitation of nonlinearity of Boolean functions and permutations? To be able to answer, the distance between Boolean functions must be defined. Having the definition, we can express the nonlinearity as the distance between the function in question and the nearest linear function." [p.325]

Random Permutations

"It seems that we can get permutations of maximum nonlinearity by generation of permutations at random." [p.333]

". . . for n = 3 . . . the range of permutation nonlinearities is between 0 (permutations are linear) and 12 (permutations are of maximum nonlinearity) Among a total of 1000 trials, 408 random permutations are of the maximum nonlinearity." "Similar results have been obtained for n = 4. The probability of getting permutations of the maximum nonlinearity drops down slightly to 0.33. At the same time, it is very hard to obtain a linear and close to linear permutations at random." [p.334]

"For n = 5, although we have carried out 10,000 trials, all permutations have their nonlinearities inside the interval [74,104]. There is no permutation of the maximum nonlinearity equal to 120." [p.334]

"The situation is similar for n = 6, 7, 8, 9." [p.334]

1988 -- Forre

Forre, R. 1988. The Strict Avalanche Criterion: Spectral Properties of Boolean Functions and an Extended Definition. Advances in Cryptology -- CRYPTO '88. 450-468.


"A necessary and sufficient condition on the Walsh-spectrum of a boolean function is given, which implies that this function fulfills the Strict Avalanche Criterion." [p.450]


"It is worthwhile pointing out the fact that any function f' of n-1 bits will be a relatively bad approximation of f if f fulfills the SAC. Indeed, the output of the best possible f' will differ from the output of f with a probability of 1/4." [p.450]

1989 -- Meier and Staffelbach

Meier, W. and O. Staffelbach. 1989. Nonlinearity Criteria for Cryptographic Functions. Advances in Cryptology -- EUROCRYPT '89. 549-562.


"Nonlinearity criteria for Boolean functions are classified in view of their suitability for cryptographic design. The classification is set up in terms of the largest transformation group leaving a criterion invariant. In this respect two criteria turn out to be of special interest, the distance to linear structures and the distance to affine functions, which are shown to be invariant under all affine transformations. With regard to these criteria an optimum class of functions is considered. These functions simultaneously have maximum distance to affine functions and maximum distance to linear structures, as well as minimum correlation to affine functions. The functions with these properties are proved to coincide with certain functions known in combinatorial theory, where they are called bent functions."[p.549]

"With respect to linear structures, a function f has optimum nonlinearity if for every nonzero vector a in GF(2)n the values f(x+a) and f(x) are equal for exactly half of the arguments x in GF(2)n. If a function satisfies this property we call it perfect nonlinear with respect to linear structures, or briefly perfect nonlinear." [p.550]

". . . this notion of perfect nonlinearity is closely related to another design criterion for S-boxes, namely the strict avalanche criterion (SAC)." "Recall that a Boolean function satisfies SAC if the output changes with probability one half whenever a single bit is complemented. This means that a function satisfies SAC if the condition stated in the definition of perfect nonlinearity merely holds for vectors a of weight 1. Therefore perfect nonlinearity affects diffusion, and it is in fact a much stronger requirement than SAC. It is remarkable that in this context diffusion can be linked to nonlinearity." [p.550]

1989 -- Pieprzyk

Pieprzyk, J. 1989. Error propagation property and application in cryptography. IEE Proceedings, Part E. 136(4): 262-270.


"Shannon suggested [8] that any symmetric cryptosystem can be seen as a concatenation of many layers, any of which realizes either a transposition or a substitution. In practical implementations, however, transpositions are fixed and do not depend on cryptographic keys. On the other hand, substitutions are controlled by cryptographic keys and any key selects a suitable substitution (from now on called a permutation). Therefore, the design of symmetric cryptosystems consists of the appropriate selection of those permutations." [p.262]

"There is a common consensus that permutations used in cryptosystems should have several properties, since otherwise these cryptosystems are easily breakable [2, 3]. One of the required features is the error propagation property and its significance was identified by Feistel [4]. The property specifies that the cryptograms of messages from a close neighbourhood are dispersed over the whole cryptogram space. Its lack in cryptosystems means that a cryptanalyst, knowing a pair (message M, cryptogram C), can look for proper cryptograms in the neighbourhood of C instead of searching the whole cryptogram space, provided that the messages are close to the known message M.

"To describe the error propagation property of a Boolean function, Webster and Tavares [9] defined the strict avalanche criterion (SAC). A function satisfies the SAC if each of its output bits changes with a probability of one-half whenever a single [input] bit changes. Rejane Forre [5] showed how to identify Boolean functions that satisfy the SAC, knowing their Walsh spectra.

"In this work are introduced indicators of the error propagation property for both Boolean functions and permutations and [we] examine their natural boundaries." [p.262]

1989 -- Pieprzyk and Finkelstein

Pieprzyk, J. and G. Finkelstein. 1989. Permutations that Maximize Non-Linearity and Their Cryptographic Significance. Computer Security in the Age of Information. 63-74.


"This work is devoted to designing non-linear Boolean permutations. The first part deals with the notation of non-linearity and its properties. The second part addresses the problem of the generation of Boolean permutations in order to get the collection of non-linear Boolean functions. Finally, the examples of permutations of maximum non-linearity are given." [p.63]

1989 -- Pieprzyk

Pieprzyk, J. 1989. Non-linearity of Exponent Permutations. Advances in Cryptology -- EUROCRYPT '89. 80-92.


"This paper deals with an examination of exponent permutations with respect to their non-linearity. The first part gives the necessary background to be able to determine permutation non-linearity. The second examines the interrelation between non-linearity and Walsh transform. The next part summarizes results gathered while experimenting with different binary fields. In the last part of the work, we discuss the results obtained and questions which are still open." [p.80]

1990 -- Lloyd

Lloyd, S. 1990. Properties of binary functions. Advances in Cryptology -- EUROCRYPT '90. 124-139.


"In this paper, we shall investigate the connections between three properties of a binary function: the Strict Avalanche Criterion, balance and correlation immunity. The strict avalanche criterion was introduced by Webster and Tavares [7] in order to combine the ideas of completeness and the avalanche effect. A cryptographic transformation is said to be complete if each output bit depends on each input bit, and it exhibits the avalanche effect if an average of one half of the output bits change whenever a single input bit is changed. Forre [1] extended this notation by defining higher order Strict Avalanche Criteria. A function is balanced if, when all input vectors are equally likely, then all output vectors are equally likely. This is an important property for many types of cryptographic functions. The idea of correlation immunity is also extremely important, especially in the field of stream ciphers, where combining functions which are not correlation immune are vulnerable to ciphertext only attacks (see, for example [4]). The concept of mth order correlation immunity was introduced by Siegenthaler [5] as a measure of resistance against such an attack.

"In a previous paper [2], we found conditions under which a function satisfying the highest possible order Strict Avalanche Criterion was also balanced and/or correlation immune. Here we shall look at functions satisfying the next highest order Strict Avalanche Criterion. We shall also investigate higher orders of correlation immunity." [p.124]

1990 -- Preneel, Van Leekwijck, Van Linden, Govaerts and Vandewalle.

Preneel, B., W. Van Leekwijck, L. Van Linden, R. Govaerts and J. Vandewalle. 1990. Propagation Characteristics of Boolean Functions. Advances in Cryptology -- EUROCRYPT '90. 161-173.


"The relation between the Walsh-Hadamard transform and the autocorrelation function of Boolean functions is used to study propagation characteristics of these functions. The Strict Avalanche Criterion and the Perfect Nonlinearity Criterion are generalized in a Propagation Criterion of degree k." [p.161]

1 Introduction

"In the past [the] following criteria have been proposed: the function must have a high nonlinear order (no affine functions allowed), must be 0/1 balanced, complete, satisfy a strict avalanche criterion or be perfect non-linear (with respect to linear structures). These criteria can be extended by imposing the requirement that the functions created by fixing a number of input bits of the original function still satisfy certain criteria. A second extension is possible by not only specifying the average values but also the extreme values. It is clear that no function can satisfy all these criteria: a good function will be the golden mean." [p.161]

1991 -- Nyberg

Nyberg, K. 1991. Perfect nonlinear S-boxes. Advances in Cryptology -- EUROCRYPT '91. 378-386.


"A perfect nonlinear S-box is a substitution transformation with evenly distributed directional derivatives. Since the method of differential cryptanalysis presented by E. Biham and A. Shamir makes use of nonbalanced direction derivatives, the perfect nonlinear S-boxes are immune to this attack. The main result is that for a perfect nonlinear S-box the number of input variables is at least twice the number of output variables." [p.378]

1. Introduction

"In [12] Meier and Stafflebach discuss perfect nonlinear Boolean functions, which are defined to be at maximum distance from linear structures. These functions are the same as the previously known bent functions [15]. To construct perfect nonlinear S-boxes it is necessary that each output bit is a perfect nonlinear function of the input. But it is not sufficient, indeed, also every linear combination of output variables have to be perfect nonlinear. We present two different constructions to achieve this property." [p.378]

1991 -- Dawson and Tavares

Dawson, M. and S. Tavares. 1991. An Expanded Set of S-box Design Criteria Based on Information Theory and its Relation to Differential-Like Attacks. Advances in Cryptology -- EUROCRYPT '91. 353-367.


"In this work we present an expanded set of design criteria for creating good S-boxes based on information theoretic concepts and show that an S-Box that meets these criteria is immune to differential cryptanalysis [1]."

"We have defined a set of six properties that an Ideal S-box is required to meet. This set of properties has a broader scope than those of Forre and any S-box that meets these properties will also meet Forre's. The properties are grouped into a set of static properties and a set of dynamic properties."

Static Properties

"The first static property is that the partial information about the inputs and outputs does not reduce the uncertainty in an unknown output."

"The second static property is that the partial information about the inputs and outputs does not reduce the uncertainty in an unknown output."

"The third static property is that the uncertainty in a data value is reduced by the minimum amount possible when it passes through an S-box."

Dynamic Properties

"The dynamic properties are similar to the static properties except that they deal with the changes in inputs and outputs."

Analysis of DES S-boxes Using The Design Criteria

". . . we could not find S-boxes with substantially better information theoretic properties than the S-boxes of DES and which also meet the acknowledged DES design criteria." ". . . there were many S-boxes found which met the acknowledged DES design criteria but had poor information theoretic properties."

". . . the properties of the inverses of the DES 4x4 S-boxes were as good as those of the S-boxes themselves." ". . . the inverses of the DES 4x4 S-boxes meet the acknowledged DES design criterion which requires that at least two bits change in the output whenever one input bit is changed. These two discoveries indicate that the designers of DES placed an equal emphasis on the properties of the S-boxes and their inverses.

"In every case we found that the properties of the complete 6x4 S-boxes were better than any individual 4x4 sub-box. We concluded that using multiple sub-boxes to form a larger S-box is an important method which can be used to create S-boxes that have better properties than are possible in a single S-box."

". . . no nxn S-box can meet the Dynamic criteria perfectly because, due to the nature of the XOR function, output XOR values always occur in pairs (since a XOR b = b XOR a)."

1992 -- Sivabalan, Tavares and Peppard

Sivabalan, M, S. Tavares and L. Peppard. 1992. On the Design of SP Networks from an Information Theoretic Point of View. Advances in Cryptology -- CRYPTO '92. 260-279.

2. Evaluation Criteria for a Cryptographically Strong S-box

"Forre [9] presented a set of cryptographic properties of S-boxes based on information theory. Dawson & Tavares [10] extended Forre's ideas to define an expanded set of design criteria for cryptographically strong S-boxes. The authors viewed an S-box in two different ways: static view, which models an S-box when the inputs are steady, and dynamic view, which models an S-box when the inputs change. Forre's criteria, however, apply to the static model only. In the Dawson & Tavares' design framework both an S-box and its inverse were designed to have low information leakage. The expanded set of design criteria was developed at a "single" bit level, where information leakage between a single output bit ant the input bits or between a single output bit and the rest of the output bits were computed. We extend the design criteria to a "multiple" bit level, where information leakage between one or more output bits and the input bits or between one or more output bits and the rest of the output bits are considered." [p.261]

1992 -- Adams

Adams, C. 1992. On immunity against Biham and Shamir's "differential cryptanalysis." Information Processing Letters. 41: 77-80.

2. Avoiding differential cryptanalysis

"Differential cryptanalysis [5] is based on the fact that in many s-boxes certain input XORs (i.e., certain fixed changes in the s-box input vector) lead to certain output XORs (fixed changes in the s-box output vector) with fairly high probability ([about] 25%) and to certain other output XORs with very low or zero probability. Chosen plaintext attacks can be mounted which take advantage of the relatively high probabilities to reduce the search space for the key in use. It is obvious, therefore, that if all output XORs occurred with similar (ideally, equal) probability, differential cryptanalysis would have no greater chance of success than exhaustive search.

"We can design s-boxes with equiprobable output XORs through the use of bent functions ([10,14,2], and others)."

". . . the s-boxes described above cannot be n x n bijective s-boxes since columns in the representative matrix are bent and bent functions are not weight balanced. Therefore, SPN cryptosystems taking advantage of this work must be constructed such that it is never required to go 'backwards' through any of their component s-boxes."

1993 -- Cusick

Cusick, T. 1993. Boolean functions satisfying a higher order strict avalanche criterion. Advances in Cryptology -- EUROCRYPT '93. 102-117.

1. Introduction

"The Strict Avalanche Criterion (SAC) was introduced by Webster and Tavares [10] in connection with a study of the design of S-boxes; a Boolean function is said to satisfy the SAC if complementing a single input bit results in changing the output bit with probability one half. Forre [3] extended this concept by defining higher order strict avalanche criteria. A Boolean function on n variables satisfies the SAC of order k, 0 <= k <= n-2, if whenever k input bits are fixed arbitrarily, the resulting function of n-k variables satisfies the SAC. It is easy to see (Lloyd [5]) that if a function satisfies the SAC of order k > 0, then it also satisfies the SAC of order j for any j = 0, 1, ..., k-1. As is the case with any Boolean function criteria of cryptographic significance, it is of interest to count the functions which satisfy the criteria. A number of recent papers have dealt, wholly or in part, with counting functions that satisfy the SAC of various orders, for example, Lloyd [5, 6, 7] and Preneel et al. [8]. In all of these papers, when the number of variables is large only quadratic Boolean functions (that is, functions whose algebraic normal form contains only terms of degree <= 2) are counted. The simplest cases involve the functions satisfying the SAC of order n-2 or n-3; in these cases, no non-quadratic function can satisfy the criteria, so a complete count is obtained.

"The problem of counting the functions which satisfy the SAC of order <= n-4 is difficult, because many of the functions in these cases are non-quadratic. In this paper we apply some methods from group theory and combinatorics to give good estimates for the number of functions which satisfy the SAC of order n-4." [p.102]

1993 -- O'Connor

O'Connor, L. 1993. On the Distribution of Characteristics in Bijective Mappings. Advances in Cryptology -- EUROCRYPT 93. 360-370.

1 Introduction and Results

"Differential cryptanalysis is a statistical attack popularized by Biham and Shamir in a series of well-known papers [1, 2, 3]. The attack has been applied to a wide range of iterated mappings including LUCIFER, DES, FEAL, REDOC, Kahfre [4, 5, 12, 13, 17, 19]. As explained below, the attack is based on a quantity O called a characteristic, which has some probability pO of giving information about the secret key used in the mapping. The attack is universal in that characteristics O will always exist for any iterated mapping; however pO may be very small, and possibly less likely to furnish information concerning the key than the success of guessing the secret key at random. For this reason, differential cryptanalysis has had varying success against the iterated mappings it has been applied to, and little is known about how useful the attack is expected to be against an arbitrary iterated mapping."

"We will give a brief description of differential cryptanalysis with reference to product ciphers, though any iterated mapping would suffice. For a product cipher E that consists of R rounds, let Er(X,K) be the encryption of the plaintext X under the key K for r rounds, 1 <= r <= R. Note that ER(X,K) = E(X,K) = C is the ciphertext for X. Let dC(r) = Er(X,K) + Er(X',K) be the difference between the ciphertexts of two plaintexts X,X' after r rounds where 1 <= r <= R. For our purposes the difference operator + will refer to addition in the vector space Z2m. An r-round characteristic is defined as an (r+1)-tuple OR(dX,dY1, dY2,...,dYr) where dX is a plaintext difference, and the dYi are ciphertext differences. A plaintext pair X,X' of difference dX = X + X' is called a right pair with respect to a key K and a characteristic Or(dX,dY1, dY2,...,dYr) if when X and X' are encrypted, dC(i) = dYi for 1 <= i <= r. That is, X,X' is a right pair if the characteristic correctly predicts the ciphertext differences at each round. The characteristic Or has probability pOr if a fraction pOr of the plaintext pairs of difference dX are right pairs. On the other hand, if X,X' such that dX = X + X' is not a right pair, then it is said to be a wrong pair (with respect to the characteristic and the key). A table which records the number of pairs of difference dX that give the output difference dY for a mapping PI is called the XOR table distribution of PI. A characteristic dX,dY is said to be impossible for PI if its corresponding XOR table entry is zero. Also a characteristic will be called nonzero if w(dX),w(dY) > 0 , where w(.) is the Hamming weight function. Using a characteristic O of appropriate length it is then possible to devise a statistical experiment which when repeated a sufficient number of times will yield the subkey of the last round (see [1] for details)."

4 Conclusion and Remarks

"Our results then show that a relatively simple design can produce product ciphers for which all characteristics O are expected to (correctly) predict differences with low probability. We further note that random m-bit permutations can be generated efficiently [15], and that the fraction of permutations that are . . . linear [7] or degenerate [14] in any output bit is tending to zero rapidly as a function of m. On the other hand, Biham and Shamir [3] found that replacing the S-boxes of DES by random 4-bit permutations yielded systems that were far weaker than the original DES. The weakness of these S-boxes appears to be due to the dimension of the permutation rather than the use of [random] permutations per se."

1994 -- Daemen, Govaerts and Vandewalle

Daemen, J., R. Govaerts and J. Vandewalle. 1994. Correlation Matrices. Fast Software Encryption. Lecture Notes in Computer Science (LNCS) 1008. Springer-Verlag. 275-285.


"In this paper we introduce the correlation matrix of a Boolean mapping, a useful concept in demonstrating and proving properties of Boolean functions and mappings. It is argued that correlation matrices are the "natural" representation for the proper understanding and description of the mechanisms of linear cryptanalysis [4]. It is also shown that the difference propagation probabilities and the table consisting of the squared elements of the correlation matrix are linked by a scaled Walsh-Hadamard transform."

1 Introduction

"Most components in encryption schemes are Boolean mappings. In this paper, we establish a relation between Boolean mappings and specific linear mappings over real vector spaces. The matrices consist of the correlation coefficients associated with linear combinations of input bits and linear combinations of output bits."

1995 -- Youssef, Tavares, Mister and Adams

Youssef, A., S. Tavares, S. Mister and C. Adams. 1995. Linear Approximation of Injective S-boxes. IEE Electronics Letters. 31(25): 2168-2169.


"Nonlinearity is a crucial requirement for the substitution boxes in secure block ciphers. In this letter, we derive an estimate for the expected nonlinearity of a randomly selected injective substitution box."


"Differential cryptanalysis [1] and linear cryptanalysis [2] are powerful cryptanalytic attacks on private-key block ciphers. The complexity of differential cryptanalysis depends on the size of the largest entry in the XOR table, the total number of zeros in the XOR table, and the number of nonzero entries in the first column of that table [1], [3]. The complexity of linear cryptanalysis depends on the size of the largest entry in the linear approximation table (LAT)[2].

"One way to reduce the size of the largest entry in the XOR table is to use injective substitution boxes (s-boxes) such that the number of output bits of the s-box is sufficiently larger than the number of input bits. In this way, it is very likely that the entries in the XOR distribution table of a randomly chosen injective s-box will have only small values, making the block cipher resistant to differential cryptanalysis. Some proposed block ciphers, such as CAST [4] and Blowfish [5], take advantage of this property.

"On the other hand, Biham [6] proved that if for an nxm s-box described by f: Z2n -> Z2m we have m >= 2n - n, then at least one linear combination of the output bits must be an affine combination of the input bits and the block cipher can be trivially broken by linear cryptanalysis. In this letter, we estimate the size of the largest entry in the LAT of a randomly selected injective s-box."

1995 -- Youssef and Tavares

Youssef, A., S. Tavares. 1995. Resistance of Balanced S-boxes to Linear and Differential Cryptanalysis. Information Processing Letters. 56: 249-252.


"In this letter, we study the marginal density of the XOR distribution table, and the linear approximation table entries of regular substitution boxes (s-boxes). Based on this, we show that the fraction of good s-boxes (with regard to immunity against linear and differential cryptanalysis) increases dramatically with the number of input variables."


"Differential cryptanalysis [1], and linear cryptanalysis [3] are currently the most powerful cryptanalytic attacks on private-key block ciphers. The complexity of differential cryptanalysis depends on the size of the largest entry in the XOR table, the total number of zeros in the XOR table, and the number of nonzero entries in the first column in that table [1], [8]. The complexity of linear cryptanalysis depends upon the size of the largest entry in the linear approximation table (LAT).

"One requirement in s-box design is to have a balanced s-box (also known as a regular s-box). This means that each output symbol should appear an equal number of times when the input is varied aver all possible values.

"Gordon and Retkin calculated the probability that one or more of the output coordinates of a random, reversible s-box is an affine function. By showing that this probability decreases dramatically with the number of input variables, they conjectured that larger s-boxes are better. In this letter, we provide further evidence for their conjecture by showing that the fraction of good s-boxes, with regard to immunity against linear and differential cryptanalysis, increases dramatically with the number of input variables."

1995 -- Youssef and Tavares

Youssef, A., S. Tavares. 1995. Number of Nonlinear Regular S-boxes. IEE Electronics Letters. 31(19): 1643-1644.


"Nonlinearity is a crucial requirement for the substitution boxes in secure block ciphers. In this letter, we calculate the probability of linearity in any nonzero linear combination of the output coordinates of a randomly selected regular substitution box."


"Gordon and Retkin [1] calculated the probability that any of the output coordinates of a random, reversible substitution box (i.e., a permutation) is an affine function. After both differential cryptanalysis [2], and linear cryptanalysis [3] were introduced, it was realized that the cryptographic strength of a multi-output function depends not only on the strength of the individual output coordinates but also on the strength of every nonzero linear combination of those coordinates [4].

"One requirement in substitution-box (s-box) design is to have a regular s-box (also known as a balanced s-box). This means that each output symbol should appear an equal number of times when the input is varied over all possible values.

"In this letter, we calculate the probability that any nonzero linear combination of the output coordinates of a regular s-box is an affine function."

". . . linear regular s-boxes, i.e., regular s-boxes with the property that one or more of the nonzero linear combination of its output coordinates is affine . . . as a fraction of the total number of regular s-boxes [is] denoted by FRL(n,m) . . . ."

"One can easily get an upper bound for FRL(n,m) . . .

    FRL(n,m) < O(25n/2 / 22n)

[Your reviewer calculates: FRL(8,8) < 220 / 2 256 = 2 -236 ]


"We have derived an exact expression for the number of regular s-boxes with the property that one or more nonzero linear combinations of its output coordinates is affine. From the above, it is clear that this fraction decreases dramatically with the number of inputs."

1995 -- Youssef and Tavares

Youssef, A., S. Tavares. 1995. Information Leakage of a Randomly Selected Boolean Function. Proceedings of the 4th Canadian Workshop on Information Theory. Lecture Notes in Computer Sciences (LNCS) 1133. Springer-Verlag. 41-52.


"It is argued that a boolean function f: Z2n -> Z2m is resistant to statistical analysis if there is no significant static and dynamic leakage between its inputs and outputs. In this paper, we derive expressions for the expected value of the information leakage of randomly selected boolean functions and for the interesting cases of randomly selected balanced, and randomly selected injective boolean functions. It is shown that the expected value of different forms of information leakage decreases dramatically with the number of input variables n. For example, for a single output boolean function, we show that the expected value of different forms of leakage goes down exponentially with n."


Gordon and Retkin [9] conjectured that good substitution boxes (S-boxes) may be built by choosing a random reversible mapping of sufficient size. Their argument is based on the observation that the probability of accidental linearity occurring in such S-boxes decreases dramatically as the size of the S-box increases. Here in this paper, we provide further evidence that bigger S-boxes (by bigger we mean S-boxes with a larger number of inputs) are better by showing that the expected value of information leakage of a randomly selected boolean function decreases rapidly with the number of input variables."

1995 -- Zhang and Zheng

Zhang, X. and Y. Zheng. 1995. GAC -- the Criterion for Global Avalanche Characteristics of Cryptographic Functions. Journal for Universal Computer Science. 1(5): 316-333.


"We show that some widely accepted criteria for cryptographic functions, including the strict avalanche criterion (SAC) and the propagation criterion, have various limitations in capturing properties of vital importance to cryptographic algorithms, and propose a new criterion called GAC to measure the global avalanche characteristics of cryptographic functions."

1 Why the GAC

"In 1985, Webster and Tavares introduced the concept of the strict avalanche criterion (SAC) when searching for principles for designing DES-like data encryption algorithms [23, 24]. A function is said to satisfy the SAC if complementing a single bit results in the output of the function being complemented with a probability of a half. More formally, let Vn denote the vector space of n tuples of elements from GF(2), a function f on Vn , a mapping from Vn into GF(2), is said to satisfy the SAC if for any n-bit vector a with W(a) = 1, where W(.) denotes the Hamming weight, f(x) + f(x+a) assumes the value zero and one an equal number of times, namely f(x) + f(x+a) is a balanced function on Vn , where + denotes the addition in GF(2).

"The SAC was generalized in one direction by Forre in [7]. Forre defines that a function f satisfies the SAC of order k if a partial function obtained by keeping any k input bits to f constant still satisfies the SAC." "In another direction, the SAC has been generalized by Adams and Tavares [1] and independently by Preneel et al [16] to what is now called the propagation criterion. A function f on Vn is said to satisfy the propagation criterion with respect to vector a in Vn if f(x) + f(x+a) is balanced, and to satisfy the propagation criterion of degree k if it satisfies the propagation criterion with respect to all nonzero vectors whose Hamming weight is at most k. In informal terms, f satisfies the propagation criterion of degree k if complementing k or less bits results in the output of f being complemented with a probability of a half. We note that functions satisfying the propagation criterion of degree n coincide with bent functions, an important combinatorial structure discovered by Rothaus [17]."

"Given a function f on Vn and a vector a in Vn , the vector is said to be a linear structure of f if f(x) + f(x+a) is a constant. An affine function f(x) = a1x1 + ... + anxn + c, where aj , c in GF(2), j = 1,2,...,n, has all the vectors in Vn as its linear structures."

1995 -- Vaudenay

Vaudenay, S. 1995. An Experiment on DES Statistical Cryptanalysis.

"Abstract. Linear cryptanalysis and differential cryptanalysis are the most important methods of attack against block ciphers. Their efficiency have been demonstrated against several ciphers, including the Data Encryption Standard. We prove that both of them can be considered, improved and joined in a more general statistical framework. We also show that the very same results as those obtained in the case of DES can be found without any linear analysis and we slightly improve them into an attack with theoretical complexity 242.9.

"We can apply another statistical attack -- the X2-cryptanalysis -- on the same characteristics without a definite idea of what happens in the encryption process. It appears to be roughly as efficient as both differential and linear cryptanalysis."

"The success of those methods have focused the attention on the linear properties of the boxes. In this paper, we try to prove that the linear properties are not so important."

Terry Ritter, his current address, and his top page.

Last updated: 1997-09-25