Ritter's Cipher Boutique

Original architectures for custom ciphers.

Here you can select a base cipher and features for a custom cipher. Our existing cipher products can be modified for your needs, and our cipher prototypes can be expanded into full-blown custom ciphers in many different ways.


Contents


CIPHER SPEED

Within a given technology, a software cipher must to some extent trade off between speed and strength. But improved technology can provide better tradeoffs.

Ciphering speed is very important for any sort of centralized resource like LAN or network servers. Naturally, users who communicate with a LAN must use the cipher the LAN uses, even if cipher speed is not important to those users.

Speed is relatively easy to measure, but speed values vary with processor, clock speed, cache, operating system, code optimization, in-memory vs. disk ciphering, etc. (Even cycle-counts will differ on different-generation processors.) To understand the advantages of different techniques, we need to compare "apples with apples" in groups of similar "apples." These measurements are for file enciphering on a RAM disk, and thus include operating system overhead for file open, close, and data read/write.

[Graph: Encipher Speed for Developed Products]
Among our "more developed" implementations, DES, Triple-DES and Dagger are probably the most highly optimized. Penknife and Cloak2 are less optimal, partly because they are complex, full cryptosystems. (Both include an "enciphered batch" level, user keys in an alias file which initialize the data cipher, wildcard ciphering, etc.) For example, both Penknife and Cloak2 perform a separate error-check CRC pass over the plaintext; this takes time, but then the user knows when the file has been deciphered properly. Penknife also saves ciphertext as ASCII "lines," and has error-resilient data structures and algorithms, which further reduce speed. The Penknife and Cloak2 ciphers measure comparable to DES because real cryptosystems do more than just cipher data.

[Graph: Encipher Speed: Fencing & Mixing Prototypes]
In contrast, the Fencing and Mixing prototypes are in a new, raw, only partially optimized state. The ability to support large blocks, large keys, and ciphering based on existing ciphers (Fenced DES uses ordinary DES, for example) are advantages not found in most other ciphers. Again, this is actual file-to-file ciphering under DOS 6.21.

[Graph: Encipher Speed: Variable Size Block Prototypes]
The Variable Size Block Ciphers are an even more unique category. Not only are these implementations new and raw, but the designs include features which are going to be used in a real system anyway. As measured, these ciphers include:

The VSBC designs do more, and would be better compared to packages with similar features, but some of these features simply will not be present in any other implementation.


CIPHER STRENGTH

Cryptography has no way to measure the strength of a cipher. Since nobody can measure cipher strength, we are left to argue strength. To provide an honest basis for discussion, we disclose the full details of our basic cipher designs.

[Graph: Arguable Cipher Strength in Bits, with
Triple-DES: 168, DES: 56, Penknife: 63, Cloak2: 992,
Experimental VSBC: 992, Dagger: 144]

Cipher "strength" would seem to be the entire reason for using a cipher, but here more is not necessarily better. Once beyond the effort which an Opponent can possibly apply, the strength is probably "good enough." The problem is that we can never really know how easy it will be to break a cipher, given advanced cryptoanalytic knowledge which will always be secret. A real cryptanalyst is not going to announce when a cipher is broken, for it is only then that the investment in the work starts paying off.

We can make the cryptography war more difficult for the cryptanalyst, by giving our organizations or groups their own custom cipher. Far more than simply adding a few bits to the keyspace, each custom cipher must be attacked separately. Since much less information will be protected by a custom cipher than, say, DES, there will be much less motive for such an attack, and fewer resources available for it. Custom ciphers give the cryptographer a way to fight against the advantage the cryptanalyst normally holds.

Right now, 56 bits is too weak for new designs, and at least 90 bits has recently been recommended. However, that group apparently did not discuss the effect that "molecular computation" could have in as little as five years. Accordingly, we would now like to see 112 or 120 bits as a reasonable minimum size key. (These are secret key sizes: A public key must be something like 10 times as large as a secret key for similar strength.) We tend to use much larger keys in our designs, because storing large keys is a minimal cost, and because large keys can make for straightforward and unambiguous designs.

Of course, no cipher can hide something which is not otherwise secret. The simple use of a cipher program will not, by itself, solve all data security problems.


STREAM CIPHERS

Stream ciphers deal with bits or bytes of the message at a time. In the Classical stream cipher, each byte is ciphered independent of any other byte. Modern forms cause each byte in the message to "affect" subsequent data bytes, but no true stream cipher can affect earlier bytes.

Basically, a stream cipher has two components: a random number generator (RNG), and a combiner. The RNG produces a confusion sequence which the combiner mixes with the plaintext message to produce ciphertext. The original plaintext is recovered by mixing the exact same confusion sequence with the ciphertext in an inverse combiner or extractor.

In our own designs we typically use an Additive RNG (see the 1991 RNG survey article (168K), especially Section 4.10) because we can make these very large and very fast. (We have also used conventional congruential and shift-register generators -- as in Penknife -- and tables -- as in Dagger.) We can vary the running size, the particular primitive polynomial, the nonlinear filtering, the intermediate key-expansion RNG's, and the CRC's used to process the User Key. Any of these changes will essentially produce a different cipher.

In our designs we typically use Dynamic Substitution combiners, which are one of our major technologies because they add strength and prevent known-plaintext attack. We can vary the number of combiner levels, the number of combiners in each level, and the initialization; each variation would be a different cipher.

We typically create a random message key for every message, and then encipher the message itself under the random message key.


Penknife

Penknife is a finished e-mail stream cipher product with many features. Penknife can be customized, or some of its features applied to other ciphers.

In particular, Penknife could be made stronger, especially if the original emphasis on error-resilience could be relaxed.


Cloak2

Cloak2 is a finished strong file stream cipher product with many features. It can also be customized.

One good extension would be to strengthen the message key cipher so that message keys for individual messages could be extracted and disclosed without revealing the message keys in all messages. This would allow an organization to comply with any sort of message disclosure without weakening their overall security.


Dagger

Dagger is a finished fast stream cipher engine which can be customized.

Certainly, most applications do not need the generality built into Dagger. This means that the cipher can be strengthened in most applications.


BLOCK CIPHERS

Block ciphers are mechanisms which emulate a keyed Simple Substitution of a size which is far too large to explicitly describe. Block ciphers process each message in blocks or "chunks," but even if only one byte of a block is needed, the entire block must be ciphered and transported. Ciphering cannot occur until a block is filled.

In an ideal block cipher, each bit of the plaintext should "affect" each bit of the ciphertext, and changing any input bit should change about half of the output bits. This means that all of the uniqueness in each block of plaintext is captured to increase the complexity of the cipher. In this respect, obviously, large blocks are better, but most conventional block ciphers use tiny 8-byte blocks.


Fencing and Mixing Block Ciphers

Technical Terms


Fenced DES

(init = 31ms, ciphering = 181 KB/sec)


                              INPUT
 <------------------------- 256 bits -------------------------->

 S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S  fencing
 -------------------------------x-------------------------------  mixing
 ---------------x--------------- ---------------x---------------  mixing
 ------DES------ ------DES------ ------DES------ ------DES------  DES
 ---------------x--------------- ---------------x---------------  mixing
 -------------------------------x-------------------------------  mixing
 S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S  fencing

 <------------------------- 256 bits -------------------------->
                             OUTPUT

Here we have 32 input and 32 output byte-substitutions (S) across a 256-bit block. We also mix (---x---) 128-bit and 64-bit sub-blocks using linear Balanced Block Mixing. Because all data flows through DES, the cipher cannot be weaker than DES. Each substitution or layer is protected by DES and so apparently cannot be exposed without breaking DES and the other substitution layer.

Each 256-byte substitution table is keyed by shuffling under a cryptographic RNG initialized from a User Key. That RNG also produces 4 separate random DES keys. Normally, the keyspace of this sort of cipher is limited by the size of the RNG used to key the substitutions, and it is easy to have a true keyspace of thousands of bits.

The ability to attack the keying RNG depends upon developing the state in one or more of the substitutions, then inferring the RNG sequence. But inferring the RNG sequence can be made difficult or impossible by double-shuffling each substitution. It is not at all clear how an attacker could develop the correct state of any substitution in the first place. Even a single bit error in any input table is guaranteed to produce avalanche, so the extent of solution of these tables cannot be distinguished. Fenced DES was described on sci.crypt in early 1994.


Fenced Tree

(init = 45ms, ciphering = 113 KB/sec)


 S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S  fencing
 -------------------------------x-------------------------------  mixing
 ---------------x--------------- ---------------x---------------  mixing
 -------x------- -------x------- -------x------- -------x-------  mixing
 ---x--- ---x--- ---x--- ---x--- ---x--- ---x--- ---x--- ---x---  mixing
 -x- -x- -x- -x- -x- -x- -x- -x- -x- -x- -x- -x- -x- -x- -x- -x-  mixing
 S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S  fencing
 -x- -x- -x- -x- -x- -x- -x- -x- -x- -x- -x- -x- -x- -x- -x- -x-  mixing
 ---x--- ---x--- ---x--- ---x--- ---x--- ---x--- ---x--- ---x---  mixing
 -------x------- -------x------- -------x------- -------x-------  mixing
 ---------------x--------------- ---------------x---------------  mixing
 -------------------------------x-------------------------------  mixing
 S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S  fencing

This is similar to Fenced DES, but with three more Balanced Block Mixing layers (for a total of five) and another fencing layer instead of the DES layer.


Fenced Quad

(init = 45ms, ciphering = 398 KB/sec)


 S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S  fencing
 ===============================================================  lin-FFT
 ======= ======= ======= ======= ======= ======= ======= =======  lin-FFT
 S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S  fencing
 ======= ======= ======= ======= ======= ======= ======= =======  lin-FFT
 ===============================================================  lin-FFT
 S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S  fencing

Here we have two full linear "FFT" operations between three fencing layers. Note that many different mixing arrangements will produce equally acceptable mixing.

Each "FFT" "butterfly" operation is a Balanced Block Mixer of appropriate width. Basically, this design assumes that mixing with 32-bit sub-blocks is likely to be faster than mixing with 8-bit sub-blocks. Consequently, the mixing is broken down into 32-bit "FFT" layers and 8-bit "FFT" layers.


Fenced FFT

(init = 45ms, ciphering = 413 KB/sec)


 S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S  fencing
 ===============================================================  lin-FFT
 S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S  fencing
 ===============================================================  lin-FFT
 S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S  fencing

Here we also have linear "FFT-like" mixing between three fencing layers. But here, every "FFT" "butterfly" operation is a byte width Balanced Block Mixer.


Fenced OLS

(init = 76ms, ciphering = 372 KB/sec)


                             INPUT
 <------------------------- 256 bits -------------------------->

 ===============================================================  nl-FFT

 <------------------------- 256 bits -------------------------->
                             OUTPUT

Here we have non-linear FFT-style mixing, and no fencing layers at all. Each "FFT" "butterfly" is a byte-width Balanced Block Mixer composed of a keyed pair of orthogonal Latin squares of order 256.

This prototype uses a single 128K randomized table, with a keyspace of 3368 bits. But only memory space and initialization time prevents us from using a different table at every mixing node, thus producing an astronomical keyspace.


Variable Size Block Ciphers

The usual block cipher has a small fixed width (often 64 bits). But if plaintext really does contain uniqueness at a rate of about one bit per character, the usual block cipher covers only about eight bits of uniqueness, a searchable quantity. (This sort of attack, of course, would require a deep knowledge of language structure, but doing "Triple" anything would not affect it.) By increasing block size dramatically, we can have a cipher which is not vulnerable to this sort of attack.

The usual stream cipher can cipher sections of arbitrary size, but can only propagate uniqueness to later ciphertext. In contrast, a block cipher -- including all of these VSBC designs -- will propagate plaintext uniqueness to all bits of ciphertext in the same block.

The Variable Size Block Cipher (VSBC) concept is a new paradigm, and can be used in ways far different than fixed size block ciphers. For example, in a VSBC there is no need to use fixed-size blocks: A message can be ciphered in blocks of pseudo-random size. This means that an Opponent cannot even know what data belongs to a single block.

Here I show the structure for typically three adjacent bytes, but each of these designs is used in a file cipher which dynamically selects block size to the byte with an average size of about 2 KB.

These VSBC designs have a very similar structure for both enciphering and deciphering. Data flows down for enciphering, and up for deciphering. The tables (or Latin square rows) used for deciphering are the inverse of those used for enciphering. In the SBBM and OLs designs, the diffusion direction also changes.

In the two designs which use substitution tables, the tables are numbered to remind us that each is generally different. Nor is the table for any particular position fixed: Tables are used as needed from an array of 64 such tables. Each table is separately shuffled by the cryptographic RNG and each table represents a potential 1648-bit key. Normally, the keyspace of this sort of cipher is limited by the size of the RNG used to key the tables or squares, and it is easy to have an efficient true keyspace of many thousands of bits.

In marked contrast to other cipher designs, additional confusion and diffusion layers are easily and modularly added to produce a cipher of any desired working strength.

Performance measurements occurred for RAM-drive ciphering of a 750K file on a P90 under DOS 6.22, with single-pass shuffles.


VSBC Substitution / XOR (SubX)

(init = 15ms, ciphering = 461 KB/sec)

              INPUT or OUTPUT
        d[0]        d[1]        d[2] . . .
         |           |           |
 iv0 -> XOR  +----> XOR  +----> XOR
         |   |       |   |       |
        S00  |      S10  |      S20
         *-- | -+    *-- | -+    *--> c1
 iv1 -> XOR  |  +-> XOR  |  +-> XOR
         *---*       *---*       *--> c0
 iv2 -> XOR  |  +-> XOR  |  +-> XOR
         *-- | -+    *-- | -+    *--> c2
        S01  |      S11  |      S21
         |   |       |   |       |
 iv0 -> XOR  +----> XOR  +----> XOR
         |           |           |
        XOR <----+  XOR <----+  XOR <- iv4
         |       |   |       |   |
        S03      |  S13      |  S13
   c5 <--*    +- | --*    +- | --*
        XOR <-+  |  XOR <-+  |  XOR <- iv5
   c4 <--*       *---*       *---*
        XOR <-+  |  XOR <-+  |  XOR <- iv6
   c6 <--*    +- | --*    +- | --*
        S04      |  S14      |  S14
         |       |   |       |   |
        XOR <----+  XOR <----+  XOR <- iv4
         |           |           |
        d[0]        d[1]        d[2] . . .
               OUTPUT or INPUT

The SubX VSBC design uses exclusive-OR combining in diffusion layers, and keyed, byte-wide 256-entry substitution tables in confusion layers. The particular table used in each position is the "next" table in sequence from an array of keyed tables. (Even if two blocks do end up having the same size, they probably will not have the same tables in the same positions.)

The SubX design uses the most primitive components, and so is graphically more complex than the other designs.


VSBC Latin Square (Ls4)

(init = 55ms, ciphering = 236 KB/sec)

        d[0]        d[1]        d[2] . . .
         |           |           |
 iv0 -> Lsc  +----> Lsc  +----> Lsc
         *-- | -+    *-- | -+    *--> c1
 iv1 -> Lsc  |  +-> Lsc  |  +-> Lsc
         *---*       *---*       *--> c0
 iv2 -> Lsc  |  +-> Lsc  |  +-> Lsc
         *-- | -+    *-- | -+    *--> c2
 iv0 -> Lsc  +----> Lsc  +----> Lsc
         |           |           |
        Lsc <----+  Lsc <----+  Lsc <- iv4
   c5 <--*    +- | --*    +- | --*
        Lsc <-+  |  Lsc <-+  |  Lsc <- iv5
   c4 <--*       *---*       *---*
        Lsc <-+  |  Lsc <-+  |  Lsc <- iv6
   c6 <--*    +- | --*    +- | --*
        Lsc <----+  Lsc <----+  Lsc <- iv4
         |           |           |
        d[0]        d[1]        d[2] . . .

The Ls4 (four Latin squares per data element per diffusion direction) VSBC design uses Latin square combining, which simultaneously provides both diffusion and confusion in the same layer. A single keyed Latin square of order 256 is used, requiring 64K of store and representing a 3296-bit key. (The prototype RNG is a jitterized Additive RNG with 496 bits of initial state.)

The overall structure is quite like the SubX design, but differs in that the "table" (actually, the Ls row) used at each node is here selected by diffusion data, instead of some value related to node position. Since The Opponent does not see the diffusion data, it is going to be tough to isolate a particular "table" to be attacked separately.


VSBC Latin Square / XOR (LsX)

(init = 55ms, ciphering = 372 KB/sec)

        d[0]           d[1]           d[2] . . .
         |              |              |
 iv0 -> XOR  +-------> XOR  +-------> XOR
         *-- | ----+    *-- | ----+    *--> c2
 iv1 -> Lsc  |  +----> Lsc  |  +----> Lsc
         *---*  |  |    *---*  |  |    *--> c0
 iv2 -> Lsc  |  |  +-> Lsc  |  |  +-> Lsc
         *-- | -+      *--- | -+       *--> c1
 iv0 -> XOR  +-------> XOR  +-------> XOR
         |              |              |
        XOR <-------+  XOR <-------+  XOR <- iv4
   c6 <--*    +---- | --*    +---- | --*
        Lsc <----+  |  Lsc <----+  |  Lsc <- iv5
   c4 <--*    |  |  *---*    |  |  *---*
        Lsc <-+  |  |  Lsc <-+  |  |  Lsc <- iv6
   c5 <--*       +- | --*       +- | --*
        XOR <-------+  XOR <-------+  XOR <- iv4
         |              |              |
        d[0]           d[1]           d[2] . . .

The LsX VSBC design uses Latin square layers for both diffusion and confusion, and exclusive-OR combining for other diffusion layers. This design also demonstrates a somewhat different feedback architecture.


VSBC Substitution / Balanced Block Mixing (SBBM)

(init = 15ms, ciphering = 439 KB/sec)

   d[0]    d[1]    d[2]    d[3] . . .
    |       |       |       |
   S00     S10     S20     S30
    |       |       |       |
    +----> BBM --> BBM --> BBM -->      -----+
            |       |       |                |
           S01     S11     S21              Sy1
            |       |       |                |
            +----> BBM --> BBM -->      --> BBM -----+
                    |       |                |       |
                   S02     S12              Sx2     Sy2
                    |       |                |       |
            +----- BBM <-- BBM <--      <-- BBM <----+
            |       |       |                |
           S03     S13     S23              Sy3
            |       |       |                |
    +----- BBM <-- BBM <-- BBM <--      <----+
    |       |       |       |
   S04     S14     S24     S34
    |       |       |       |
   d[0]    d[1]    d[2]    d[3] . . .

The SBBM VSBC design uses the fast and simple Balanced Block Mixing component in the diffusion layers, and byte-wide keyed substitution tables in the confusion layers. Again, these tables are used in cyclic sequence from an array of keyed tables, so the same tables may or may not occur in the same positions in some other block.


VSBC Orthogonal Latin Square (OLs)

(init = 76ms, ciphering = 369 KB/sec)

   d[0]    d[1]    d[2]    d[3] . . .
    |       |       |       |
    +----> OLs --> OLs --> OLs -->      -----+
            |       |       |                |
            +----> OLs --> OLs -->      --> OLs -----+
                    |       |                |       |
            +----- OLs <-- OLs <--      <-- OLs <----+
            |       |       |                |
    +----- OLs <-- OLs <-- OLs <--      <----+
    |       |       |       |
   d[0]    d[1]    d[2]    d[3] . . .

This OLs VSBC design uses a keyed orthogonal pair of Latin squares which realize a keyed Balanced Block Mixer at each node. The OLs structure fills 128K and also represents 3296 bits of key.


SUMMARY

Any of these ciphers or prototypes can be the basis of a new custom cipher.


Terry Ritter, his current address, and his top page.

Last updated:1996-02-25