Ritter's Crypto Glossary and
Dictionary of Technical Cryptography

Technical Cryptographic Terms Explained

Laugh at deceptive claims!
Learn why cryptography cannot be guaranteed!

See Cryptography's Hall of Shame!: Crypto Controversies

A Ciphers By Ritter Page

Terry Ritter

2007 August 16

Copyright 1995 to 2007 Terry Ritter. All Rights Reserved.

For a basic introduction to cryptography, see "Learning About Cryptography" @: http://www.ciphersbyritter.com/LEARNING.HTM. Please feel free to send comments and suggestions for improvement to: ritter@ciphersbyritter.com (you may need to copy and paste the address into a web email reader). You may wish to help support this work by patronizing "Ritter's Crypto Bookshop" at: http://www.ciphersbyritter.com/BOOKSHOP.HTM.


Index

0 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

Or use the browser facility "Edit / Find on this page" to search for particular terms.


Major Topics


Contents

Introduction

0
1/f Noise, 8b10b
A
Abelian, Absolute, AC, Academic, Academic Break, Access, Access Control, Accident Fallacy, Accountability, Accuracy, Acronym, Active, ad baculum, Additive Combiner, Additive RNG, Additive Stream Cipher, Adduction, ad hoc, ad hominem, ad ignorantium, ad nauseam, ad populum, ad verecundiam, AES, Affine, Affine Boolean Function, Affine Cipher, Algebra, Algebraic Normal Form, Algebra of Secrecy Systems, Algorithm, Algorithmic Complexity, Alias File, Allan Variance, All or Nothing Transform, Alphabet, Alternative Hypothesis, Amphiboly, Amplifier, Amplitude, Anagram, Analog, Analogy, Analysis, AND, ANF, Anode, Antecedent, AONT, Appeal to Ignorance, Appeal to Tradition, Arc, Argument, Argumentation, Argument By Innuendo, Arity, ASCII, Associative, Assumption, Asymmetric Cipher, Asynchronous, Asynchronous Stream Cipher, Asynchronous Transmission, Attack, Attack Tree, Augmented Repetitions, Authentication, Authenticating Block Cipher, Authority, Autocorrelation, AUTODIN, Autokey, Automorphism, AUTOSEVOCOM, Availability, Avalanche, Avalanche Effect, Avalanche Multiplication, Axiom
B
Back Door, Balance, Balanced Block Mixer, Balanced Block Mixing, Balanced Combiner, Balanced Line, Bandwagon, Base-64, Base Spreading Resistance, BBM, BBS, BB&S, Begging the Question, Bel, Belief, Bent Function, Berlekamp-Massey, Bernoulli Trials, Bias, Bijection, Bijective, Bijective Compression, Binary, Binomial Distribution, Bipolar, Birthday Attack, Birthday Paradox, Bit, Bit Balance, Bit Permutation, Bit Permutation Cipher, Bit Shuffling, Bit Transposition, Black, Black Box, Block, Block Cipher, Block Cipher Definitions, Block Cipher and Stream Cipher Models, Block Code, Block Size, Blum, Blum and Shub, Boolean, Boolean Algebra, Boolean Function, Boolean Function Nonlinearity, Boolean Logic, Boolean Mapping, Braid, Branch Number, Break, Brute Force Attack, Bug, Burden of Proof, Butterfly, Bypass, Byte
C
C, CA, Capacitor, Cardinal, Card Stacking, Cartesian Product, Cascade, Cascade Ciphering, Cathode, CBC, c.d.f., Certify, Certification Authority, CFB, Chain, Chance, Chaos, Characteristic, Checkerboard Construction, Checksum, Chi-Square, Chosen Plaintext, Cipher, Cipher Block Chaining, Ciphering, Cipher System, Cipher Taxonomy, Cipher Testing, Ciphertext, Ciphertext Expansion, Ciphertext Feedback, Ciphertext Only, Ciphertext Only Attack, Ciphony, Circuit, Circular Argument, circulus in demonstrando, circulus in probando, Claim, Cleartext, Cloak2, Clock, Closure, Code, Codebook, Codebook Attack, Codebreaking, Codeword, Coding Theory, Coefficient, Cognitive Dissonance, Combination, Combinatoric, Combiner, Common Mode, Commutative, Complete, Complex Number, Complex Question, Component, Composite, Composition, Compression, Compromise, Computer, COMSEC, Conclusion, Condition, Conductor, Confidential, Confusion, Confusion Sequence, Congruence, Conjecture, Consequent, Conspiracy, Constant, Contextual, Contradiction, Conventional Block Cipher, Conventional Cipher, Conventional Current Flow, Convolution, Copyright, Corollary, Correlation, Correlation Coefficient, Counterexample, Counter Mode, Counting Number, Covariance, Coverage, CRC, Crib, CRNG, CRT, Cryptanalysis, Cryptanalyst, Crypto Controversies, Cryptographer, Cryptographic Hash, Cryptographic Mechanism, Cryptographic Random Number Generator, Cryptography, Cryptography War, Cryptology, Cryptosystem, Crystal, Crystal Oscillator, Current, Cycle, Cyclic Group, Cypher
D
Data, Data Compression, Data Fabrication, Data Falsification, Data Security, dB, DC, Debug, Decade, Deception, Decibel, Decimal, Decimation, Decipher, Decoupling, Decryption, Deductive Reasoning, Defined Plaintext, Defined Plaintext Attack, Degenerate Cycle, Degree, Degrees of Freedom, DeMorgan's Laws, Depletion Region, DES, Design Strength, Deterministic, Deus ex Machina, DH, Dialectic, Dichotomy, Dictionary Attack, Dictionary Fallacy, Differential Cryptanalysis, Differential Mode, Diffie Hellman, Diffusion, Digital, Digital Signature, Diode, Distinguisher, Distribution, Distributive, Divide and Conquer, Division, Dogma, Domain, Double Shuffling, DSA, DSP, DSS, Due Care, Due Diligence, Dyadic, Dynamic Keying, Dynamic Substitution Combiner, Dynamic Transposition
E
Ebers-Moll Model, ECB, ECC, ECDSA, EDE, Efficiency, Electric Field, Electromagnetic Field, Electromagnetic Interference, Electrostatic Discharge, Electronic, Electronic Codebook, EMI, Encipher, Encryption, Enemy, Engineering, Ensemble Average, Entropy, Equation, Equivocation, Ergodic, Ergodic Process, Error Correcting Code, Error Detecting Code, ESD, Even Distribution, Evidence, Exclusive-OR, Expectation, Exposure, Expression, Extraordinary Claims, Extractor
F
Fq, Fq*, (Fq,+), Factor, Factorial, Failure, Failure Modes and Effects Analysis, Fallacy, Fast Walsh Transform, Fault, Fault Tolerance, Fault Tree Analysis, FCSR, Feedback, Feistel, Feistel Construction, Fenced DES, Fencing, Fencing Layer, FFT, Field, FIFO, Filter, Finite Field, Finite State Machine, Flat Distribution, Flip-Flop, Flow Control, Formal Proof, Fourier Series, Fourier Theorem, Fourier Transform, Frequency, FSM, Function, FWT
G
(G,*), Gain, Galois Field, Game Theory, Garble, Gate, Gaussian, GCD, Geffe Combiner, Geiger-Mueller Tube, Generator, GF(x), GF(2), GF(2n), GF(2)[x], GF(2)[x]/p(x), Goodness of Fit, Gray Code, Greek Alphabet, Ground, Ground Loop, Group
H
Hadamard, Hamming Distance, Hardware, Hash, Hazzard, Heuristic, Hex, Hexadecimal, Hidden Markov Model, Hold Time, Homomorphism, Homophonic, Homophonic Substitution, HTTP Status Codes, Huge Block Cipher Advantages, Hybrid, Hypothesis
I
IDEA, Ideal Mixing, Ideal Secrecy, Identity, Identity Element, ignoratio elenchi, i.i.d., Imaginary Number, Impedance, Impedance Matching, Impossible, Improbable, Independent, Independent Variable, Inductive Reasoning, Inductor, Inference, Informal Proof, Information, Injective, Innuendo, Insulator, Integer, Integrity, Intellectual Property, Intermediate Block, Interval, Into, Invention, Inverse, Inverter, Invertible, Involution, Irreducible, Iterated Block Cipher, IV
J
Jitter, Jitterizer, Johnson Noise, Just Semantics
K
Karnough Map, KB, Kb, Kerckhoffs' Requirements, Key, Key Archives, Key Authentication, Key Distribution Problem, Key Loading, Key Loss, Key Management, Key Problems, Key Reuse, Key Selection, Key Storage, Key Transport, Keyed Substitution, Keyphrase, Key Schedule, Keyspace, Keystream, Keystroke, Known Plaintext, Known Plaintext Attack, Kolmogorov-Chaitin Complexity, Kolmogorov-Smirnov
L
Latency, Latin Square, Latin Square Combiner, Law, Layer, LC, Lemma, Letter Frequencies, LFSR, Lie, LIFO, Linear, Linear Complexity, Linear Cryptanalysis, Linear Equation, Linear Factor, Linear Feedback Shift Register, Linear Logic Function, Log2, Logic, Logic Fallacy, Logic Function, Logic Level, lsb, lsB
M
MAC, M-Sequence, Machine Language, Magnetic Field, Man-in-the-Middle Attack, Mapping, Markov Chain, Markov Process, Mathematical Cryptography, Math Notation, Maximal Length, MB, Mb, MDS Codes, Mean, Mechanism, Mechanistic Cryptography, Median, Mere Semantics, Mersenne Prime, Message, Message Archives, Message Authentication, Message Authentication Code, Message Digest, Message Integrity, Message Key, Metastability, Method of Proof and Refutations, Military Grade, Misdirection, MITM, Mixing, Mixing Cipher, Mixing Cipher Design Strategy, Mod 2, Mod 2 Polynomial, Mode, Modulo, Monadic, Monoalphabetic Substitution, Monographic, Monoid, Monomial, msb, msB, M-Sequence, Multipermutation, Multiple Anagramming, Multiple Ciphering, Multiple Encryption
N
N, National Cryptologic Museum, Natural Number, Negative Resistance, NIST, Noise, Nomenclator, Nominal, non causa pro causa, Nonce, Nonlinearity, Nonrepudiaiton, non sequitur, Normal Distribution, NOT, Novelty, NSA, Null, Null Distribution, Null Hypothesis
O
OAEP, Object Code, Objective, Occam's Razor, Octal, Octave, OFB, Ohm's Law, Old Wives' Tale, oLs, One-Sided Test, One-Tailed Test, One Time Pad, One-To-One, One Way Diffusion, One Way Hash, Onto, Op Amp, Opcode, Operating Mode, Operational Amplifier, Opponent, Option, OR, Order, Ordinal, Orthogonal, Orthogonal Latin Squares, Oscillator, OTP, Output Feedback, Overall Diffusion
P
P-Value, Padding, Paradox, Parallel, Parity, Password, Patent, Patent Changes, Patent Claims, Patent Complaints, Patent Consequences, Patenting Cryptography, Patenting Software, Patent Infringement, Patent Reading, Patent Valuation, Path, Pedantic, Penknife, Perfect Secrecy, Period, Permutation, petitio principii, PGP, Phase, Phase Locked Loop, Phase Noise, Physically Random, Piezoelectric, Pink Noise, Pipeline, PKCS, PKI, Plagiarism, Plaintext, PLL, PN Sequence, Poisson Distribution, Polyalphabetic Combiner, Polyalphabetic Substitution, Polygram Substitution, Polygraphic, Polynomial, Polyphonic, Population, Population Estimation, post hoc ergo propter hoc, Power, Practice, PRBS, Precision, Predictable, Premise, Primitive, Primitive Polynomial, Prime, Prior Art, Privacy, PRNG, Probability, Process, Product Cipher, Proof, Propaganda, Proposition, Pseudorandom, PTO, Public Key Cipher, Public Key Infrastructure, Pure Cipher
Q
Qbit, Quality, Quality Management, Quantum Computer, Quantum Cryptography, Quartz, Quoting Out Of Context
R
R, R[x], R/I, Random, Randomness Testing, Random Number, Random Number Generator, Random Sampling, Random Variable, Random Walk, Range, Rationalization, Ratio Transformer, RC Filter, Reactance, Really Random, Real Number, Reasoning, Red, Red Herring, reductio ad absurdum, Redundancy, Relation, Relative Entropy, Relatively Prime, Relay, Reliability, Re-Originating Stream Cipher, Research Hypothesis, Resistor, Resistor Excess Noise, Resonance, Reverse CRC, Rhetoric, Ring, Ringing, Risk, Risk Analysis, Risk Management, Root, RMS, Root Mean Square, RNG, Round, RSA, Rule of Thumb, Running Key
S
SAC, Safe Prime, Salt, Sample, S-Box, Scalable, Scalar, Schematic Diagram, Science, Scientific Method, Scientific Model, Scientific Paper, Scientific Publication, Scrambler, Secrecy, Secret Code, Secret Key Cipher, Security, Security Through Obscurity, Seed, Self Inverse, Self-Synchronizing Stream Cipher, Semiconductor, Semigroup, Series, Session, Session Key, Set, Setup Time, Shannon, Shielding, Shift Register, Shot Noise, Shuffle, Sieve of Eratosthenes, Significance, Simple Substitution, Sine Wave, Single Point of Failure, Smooth Number, Snake Oil, Socrates, Socratic Method, Software, Software Engineering, Software Patent, Sophist, Sophistry, Source Code, S-P, Special Prime, Special Pleading, Specification, Speculation, Spin, SPN, Squares, Square Wave, SSL, Stability, Stake, Standard Cipher, Standard Deviation, State, State Machine, Static Electricity, Stationary Process, Statistic, Statistics, Steganography, Stochastic, Stochastic Process, Straw Man, Stream, Stream Cipher, Strength, Strict Avalanche Criterion (SAC), Structured Programming, Subjective, Substitution, Substitution Cipher, Substitution-Permutation, Substitution Table, Superencipherment, Superencryption, Superposition, Surjective, Switch, Switching Function, Symmetric Cipher, Symmetric Group, Synchronization, Synchronous, Synchronous Stream Cipher, Synthesis, System, System Design
T
Table Selection Combiner, Tautology, Taxonomy, Temperature, TEMPEST, Temporal Average, Term, Term of Art, Test, Theorem, Theory, Thermal Noise, Thesis, Threat, Threat Model, Time Constant, Trademark, Trade Secrecy, Traffic Analysis, Trajectory, Transformer, Transistor, Transistor Saturation, Transistor Self-Bias, Transposition, Transposition Cipher, Trap Door, Tree Analysis, Trinomial, Triple DES, TRNG, Trojan Horse, Truly Random, Trust, Truth Table, tu quoque, Tunneling, Two-Sided Test, Two-Tailed Test, Type I Error, Type II Error
U
UFN, Unary, Unbalanced Feistel Network, Uncertainty, Unexpected Distance, Unicity Distance, Uniform Distribution, Universe, Unknowable, Unpredictable, User Authentication
V
Variable, Variable Size Block Cipher, Variance, VCO, Vector, VENONA, Vernam Cipher, Vet, Voltage, Voltage Controlled Oscillator, Voltage Divider, Vulnerability
W
Walsh Functions, Walsh-Hadamard Transform, Weak Key, Weight, Whitening, White Noise, WHT, Wide Trail Strategy, Wire, Work Characteristic
X
XOR, XOR Encryption
Z
Z, Zener Breakdown, Zeroize, Zn, Zn*, (Z,+), (Z,+,*), Z/n, Z/p, Z/pZ, (Z/pZ)[x]

Introduction to the Crypto Glossary

This Glossary started as a way to explain the terms on my cryptography web pages describing my:

Having a Glossary meant I could reduce the text on most pages, while expanding background for the definitions, and relating the ideas to other similar, contradictory, or more basic ideas.

Why Bother with Definitions?

The value of a definition is insight. But:

  • Simple descriptions are not always possible.
  • Terms have meaning within particular contexts.
  • Tedious examples may be required to expose the full meaning.
Good definitions can expose assumptions and provide a basis for reasoning to larger conclusions.

Consider the idea that cryptography is used to keep secrets: We expect a cipher to win each and every contest brought by anyone who wishes to expose secrets. We call those people opponents, but who are they really, and what can they do? In practice, we cannot know. Opponents operate in secret: We do not know their names, nor how many they are, nor where they work. We do not know what they know, nor their level of experience or resources, nor anything else about them. Because we do not know our opponents, we also do not know what they can do, including whether they can break our ciphers. Unless we know these things that cannot be known, we cannot tell whether a particular cipher design will prevail in battle. We cannot expect to know when our cipher has failed.

Even though the entire reason for using cryptography is to protect secret information, it is by definition impossible to know whether a cipher can do that. Nobody can know whether a cipher is strong enough, no matter how well educated they are, or how experienced, or how well connected, because they would have to know the opponents best of all. The definition of cryptography implies a contest between a cipher design and unknown opponents, and that means a successful outcome cannot be guaranteed by anyone.

Sometimes the Significance is Implied

Consider the cryptographer who says: "My cipher is strong," and the cryptanalyst who says: "I think your cipher is weak." Here we have two competing claims with different sets of possibilities: First, the cryptographer has the great disadvantage of not being able to prove cipher strength, nor to even list every possible attack so they can be checked. In contrast, the cryptanalyst might be able to actually demonstrate weakness, but only by dint of massive effort which may not succeed, and will not be compensated even if it does. Consequently, most criticisms will be extrapolations, possibly based on experience, and also possibly wrong.

The situation is inherently unbalanced, with a bias against the cryptographer's detailed and thought-out claims, and for mere handwave first-thoughts from anyone who deigns to comment. This is the ultimate conservative bias against anything new, and for the status quo. Supposedly the bias exists because if the cryptographer's claim is wrong user secrets might be exposed. But the old status-quo ciphers are in that same position. Nothing about an old cipher makes it necessarily strong.

Unfortunately, for users to benefit from cryptography they have to accept some strength argument. Even more unfortunately:

  • Many years of trusted use do not testify about strength, but do provide both motive and time for opponents to develop secret attacks.
  • Many failures to break a cipher do not imply it is strong.
  • There can be no expertise on the strength of unbroken ciphers.
So on the one hand we need a cipher, and on the other have no way to know how strong the various ciphers are. For an industry, this is breathtakingly disturbing.

In modern society we purchase things to help us in some way. We go to the store, buy things, and they work. Or we notice the things do not work, and take them back. We know to take things back because we can see the results. Manufactured things work specifically because design and production groups can test which designs work better or worse or not at all. In contrast, if the goal of cryptography is to keep secrets, we generally cannot expect to know whether our cipher has succeeded or failed. Cryptography cannot test the fundamental property of interest: whether or not a secret has been kept.

The inability to test for the property we need is an extraordinary situation; perhaps no other manufactured thing is like that. Because the situation is unique, few understand the consequences. Cryptography is not like other manufactured things: nobody can trust it because nobody can test it. Nobody, anywhere, no matter how well educated or experienced, can test the ability of an unbroken cipher to keep a secret in practice. Thus we see how mere definitions allow us to deduce fundamental limitations on cryptography and cryptanalysis by simple reasoning from a few basic facts.

Relationships Between Ideas

The desire to expose relationships between ideas meant expanding the Glossary beyond cryptography per se to cover terms from related areas like electronics, math, statistics, logic and argumentation. Logic and argumentation are especially important in cryptography, where measures are few and math proofs may not apply in practice.

This Crypto Glossary is directed toward anyone who wants a better understanding of what cryptography can and cannot do. It is intended to address basic cryptographic principles in ways that allow them to be related, argued, and deeply understood. It is particularly concerned with fundamental limits on cryptography, and contradictions between rational thought and the current cryptographic wisdom. Some of these results may be controversial.

The Glossary is intended to build the fundamental understandings which lie at the base of all cryptographic reasoning, from novice to professional and beyond. It is particularly intended for users who wish to avoid being taken in by attacker propaganda. (Propaganda is an expected part of cryptography, since it can cause users to take actions which make things vastly easier for opponents.) The Glossary is also for academics who wish to see and avoid the logic errors so casually accepted by previous generations. One goal of the Glossary is to clarify the usual casual claims that confuse both novices and professionals. Another is to provide some of the historical technical background developed before the modern mathematical approach.

Reason in Cryptography

The way we understand reality is to follow logical arguments. All of us can do this, not just professors or math experts. Even new learners can follow a cryptographic argument, provided it is presented clearly. So, in this Glossary, one is occasionally expected to actually follow an argument and come to a personal conclusion. That can be scary when the result contradicts the conventional wisdom; then one then starts to question both the argument and the reasoning, as I very well know. But that scary feeling is just an expected consequence of a field which has allowed various unsupported claims and unquestioned beliefs to wrongly persist (see old wives' tales).

Unfortunately, real cryptography is not well-modeled by current math (for example, see proof and cryptanalysis). It is normally expected that the link between theory and reality is provided by the assumptions the math requires. (Obviously, proof conclusions only apply in practice when every assumed quality actually occurs in practice.) In math, each of these assumptions has equal value (since the lack of any one will void the conclusion), but in practice some assumptions are more equal than others. Certain assumptions conceivably can be guaranteed by the user, but other assumptions may be impossible to guarantee. When a model requires assumptions that cannot be verified in practice, that model cannot predict reality.

Current mathematical models almost never allow situations where the user can control every necessary assumption, making most proof results meaningless in practice. In my view, mathematical cryptography needs practical models. Of course, one might expect more realistic models to be less able to support the current plethora of mathematical results. Due to the use of more realistic models, some results in the Crypto Glossary do contradict well-known math results.

Opposing Philosophies

By carrying the arguments of conventional cryptographic wisdom to their extremes, it is possible to see two opposing groups, which some might call theoretical versus practical. While this simplistic model is far too coarse to take very seriously, it does have some basis in reality.

The Crypto Theorists supposedly argue that no cryptosystem can be trusted unless it has a mathematical proof, since anything less is mere wishes and hope. Unfortunately, there is no such cryptosystem. No cipher can be guaranteed strong in practice, and that is the real meaning of the one time pad. As long as even one unbreakable system existed, there was at least a possibility of others, but now there is no reason for such hope. The OTP is secure only in simplistic theory, and strength cannot be guaranteed in practice for users. This group seems most irritating when they imply that math proofs are most important, even when in practice those proofs provide no benefits to the user.

The Crypto Practitioners supposedly argue that systems should be designed to oppose the most likely reasonable threats, as in physical threat model analysis. In the physical world it is possible to make statements about limitations of opponents and attacks; unfortunately, few such statements can be made in cryptography. In cryptography, we know neither the opponents nor their attacks nor what they can do in combination. Successful attack programs can be reproduced and then applied by the most naive user, who up to that time had posed only the most laughable threat.

Both groups are wrong: There will be no proof in practice, and speculating on the abilities of the opponents is both delusional and hopeless. Moreover, no correct compromise seems possible. Taking a little proof from one side and some threat analysis from the other simply is not a valid recipe for making secure ciphers.

There is a valid recipe for security and that is a growing, competitive industry of cipher development. Society needs more than just a few people developing a handful of ciphers, but actual design groups who continually innovate, design, develop, measure, attack and improve new ciphers in a continuing flow. That is expensive work, as the NSA budget clearly shows. Open society will get such results only if open society will pay for them. Since payment is the issue, it is clear that "free" ciphers act to oppose exactly the sort of open cryptographic development society needs.

Absent an industry of cipher design, perhaps the best we can do is to design systems in ways such that a cipher actually can fail, while the overall system retains security. That is redundancy, and is a major part of engineering most forms of life-critical systems (e.g., airliners), except for cryptography. The obvious start is multiple encryption.

What is the Point?

The practical worth of all this should be a serious regard for cryptographic risk. The possibility of cryptographic failure exists despite all claims and proofs to the contrary. Users who have something to protect must understand that cryptography has risks, and there is a real possibility of failure. If a possibility of information exposure is acceptable, one might well question the use of cryptography in the first place.

Even if users only want their information probably to be secure, they still have a problem: Only our opponents know our cipher failures, because they occur in secret. Our opponents do not expose our failures because they want those ciphers to continue in use. Few if any users will know when there is a problem, so we cannot count how many ciphers fail, and so cannot know that probability. Since there can be no expertise about what unknown opponents do, looking for an "expert opinion" on cipher failure probabilities or strength is just nonsense.

Conventional cryptographic expertise is based on the open literature. Unfortunately, unknown attacks can exist, and even the best informed cannot predict strength against them. While defending against known attacks may seem better than nothing, that actually may be nothing to opponents who have another approach. In the end, cipher and cryptosystem designers vigorously defend against attacks from academics who will not be their opponents.

On the other hand, even opponents read the open literature, and may make academic attacks their own. But surprisingly few academic attacks actually recover key or plaintext and so can be said to be real, practical threats. Much of the academic literature is based on strength assumptions which cannot be guaranteed or vulnerability assumptions which need not exist, making the literature less valuable in practice than it may appear.

Math cannot prove that a cipher is strong in practice, so we are forced to accept that any cipher may fail. We do not, and probably can not know the likelihood of that. But we do know that a single cipher is a single point of failure which just begs disaster. (Also see standard cipher.)

It is possible to design in ways which reduce risk. Systems can be designed with redundancy to eliminate the single point of failure (see multiple encryption). This is often the done in safety-critical fields, but rarely in cryptography. Why? Presumably, people have been far too credulous in accepting math proofs which rarely apply in practice. Thus we see the background for my emphasis on basics, reasoning, proof, and realistic math models.

Simple Encryption

To protect against fire, flood or other disaster, most software developers should store their current work off-site. The obvious solution is to first encrypt the files and then upload an archive to a web site. The straightforward use of cryptography to protect archives is an example of the pristine technical situation often seen as normal. Then we think of cipher strength and key protection, which seem to be all there is. But most cryptography is not that simple.

Climate of Secrecy. For any sort of cryptography to work, those who use it must not give away the secrets. Most times keeping secrets is as easy, or as hard, as just not talking or writing about them. Issues like minimizing paper output and controlling and destroying copies seem fairly obvious, although hardly business as usual. But secrets are almost always composed in plaintext, and the computers doing that may have plaintext secrets saved in various hidden operating system files. And opponents may introduce programs to compromise computers which handle secrets. It is thus necessary to control all forms of access to equipment which holds secrets, despite that being awkward and difficult. It is especially difficult to control access on the net.

Network Security. Computers only can do what they are told to do. When network designers decide to include features which allow attacks, that decision is as much a part of the problem as an attack itself. It seems a bit much to complain about insecurity when insecurity is part of the design. Design decisions have made the web insecure. Until web systems only implement features which maintain security, there can be none.

It is possible to design computing systems more secure than the ones we have now. If we provide no internal support for external attack, no attacks can prevail. The entire system must be designed to limit and control external web access and prevent surprises that slip by unnoticed. We can decompose the system into relatively small modules, and then test those modules in a much stronger way than trying to test a complex program. A possible improvement might be some form of restricted intermediate or quarantine store between the OS and the net. Better security design may mean that some things now supported insecurely no longer can be supported at all.

Current practice identifies two environments: The local computer, which is "fully" trusted, and the Internet, which is not trusted. This verges on a misuse of the concept of trust, which requires substantial consequences for misuse or betrayal. Absent consequences, trust is mere unsupported belief and provides no basis for reasoning. We do not trust a machine per se, since it only does what the designer made it do. And when there are no consequences for bad design, there really is no reason to trust the designer either.

A better approach would be fine OS control over individual programs, including individual scripts, providing validation and detailed limits on what each program can do, on a per-program basis. This would expand the firewall concept from just net access to every resource, including processor time, memory, all forms of I/O, plus the ability to invoke, or be invoked by, other programs. For example, most programs do not need, and so would not be allowed, net access, even if invoked by a program or running under a process which has such access. Programs received from the net would by default start out in quarantine, not have access to normal store, and could run only under strong limitations. A human would have to explicitly elevate them to a selected higher status, with the change logged. Program operation exceeding limitations would be prevented, logged, and accumulated in a control which supported validation, fine tuning, selective responses and serious quarantine.

Security is Off-The-Net. The best way to avoid web insecurity has nothing to do with cryptography. The way to avoid web insecurity is to not connect to the web, ever. Use a separate computer for secrets, and do not connect it to the net, or even a LAN, since computers on the LAN probably will be on the net. Carefully move information to and from the secrets computer with a USB flash drive. Protect access to that equipment.

Glossary Structure and Use

For most users, the Crypto Glossary will have many underlined (or perhaps colored) words. Usually, those are hypertext "links" to other text in the Glossary; just click on the desired link.

Links to my other pages generally offer a choice between a "local" link or a full web link. The user working from a downloaded copy of the Glossary only would normally use the full web links. The user working from a CD or disk-based copy of all my pages would normally use the local links.

Links to my other pages also generally open and use another window. (Hopefully that will avoid the need to reload the Glossary after a reference to another article.) Similarly, links from my other pages to terms in the Glossary also generally open a window specifically for the Glossary. (In many cases, that will avoid reloading the Glossary for every new term encountered on those pages.)

In cryptography, as in much of language in general, the exact same word or phrase often is used to describe two or more distinct ideas. Naturally, this leads to confused, irreconcilable argumentation until the distinction is exposed (and often thereafter). Usually I handle this in the Crypto Glossary by having multiple numbered definitions, with the most common usage (not necessarily the best usage) being number 1.

The worth of this Glossary goes beyond mere definitions. Much of the worth is the relationships between ideas: Hopefully, looking up one term leads to other ideas which are similar or opposed or which support the first. The Glossary is a big file, but breaking it into many small files would ruin much of the advantage of related ideas, because then most related terms would be in some other part. And although the Glossary could be compressed, that would generally not reduce download time, because most modems automatically compress data during transmission anyway. Dial-up users typically should download the Glossary onto local storage, then use it locally, updating periodically.

Value

I have obviously spent a lot of personal time constructing this Crypto Glossary, with the hope that it would be more than just background to my work. Hopefully, the Glossary and the associated introduction: "Learning About Cryptography" (see locally, or @: http://www.ciphersbyritter.com/LEARNING.HTM) will be of some wider benefit to the crypto community. So, if you have used this Glossary lately, why not drop me a short email and tell me so? Feel free to tell me how much it helped or even how it failed you; perhaps I can make it better for the next guy. If you use web email, just copy and paste my email address: ritter@ciphersbyritter.com


1/f Noise
In electronics, a random-like analog signal with amplitude proportional to the inverse of frequency. Also called "flicker." Well known both in semiconductor electronics and physics. Not a white noise, but a pink noise.

Resistor excess noise is a 1/f noise generated in non-homogenous resistances, such as the typical thick-film surface-mount (SMT) resistor composed of conductor particles and fused glass. It is thought that DC current forms a preferential path through the conductive grains, a path that varies dynamically at random, thus modulating the resistance and creating noise. Homogenous metal films do not have 1/f noise.

The especially large amount of 1/f noise in MOSFET's could be understood if the glass layer the gate rests on is unexpectedly rough. That could could create islands of conduction (which some literature appears to support), which then act like resistive grains.

In a single-crystal semiconductor, 1/f noise may be related to the organization of atomic bonding at the outside surfaces, which must be different than inside the crystalline bulk material. If the semiconductor surface could be shown to be composed of conductive islands, that would be an enlightening result.

8b10b
A block code which represents 8-bit data as 10-bit values or codewords. This gives 1024 codewords to encode 256 values plus perhaps 12 new control codes; this freedom can be used to approach general bit balance in each codeword. However, since a 10-bit code has only 252 balanced values (whereas 256 data and perhaps 12 control symbols are required), balancing must extend across codewords. The encoding process must maintain a count of the current unbalance and correct that in the next codeword. Also see coding theory.

Abelian
In abstract algebra, a commutative group or ring.

Absolute
In the study of logic, something observed similarly by most observers, or something agreed upon, or which has the same value each time measured. Something not in dispute, unarguable, and independent of other state. As opposed to contextual.

AC
Alternating Current: Electrical power which repeatedly reverses direction of flow. As opposed to DC.

Generally used for power distribution because the changing current supports the use of transformers. Utilities can thus transport power at high voltage and low current, which minimize "ohmic" or I2R losses. The high voltages are then reduced at power substations and again by pole transformers for delivery to the consumer.

Academic
1. Scholarly.
2. Theoretical.
3. Conventional.
4. Impractical.

Academic Break
A break or technically successful attack which is also impractical. See: academic.

Access
The ability (right or permission) to interact with (approach, enter, speak with, read or use) some one or some thing.

Access Control
A possible goal of cryptography. The idea of restricting documents, equipment or keys to those authorized such access.

Accountability
Nonrepudiation. A goal of cryptography. Responsibility for messages sent.

Accuracy
The ratio of a measured value to the true value. A percentage of the measured value, surrounding the measured value, which is supposed to contain the correct value. Also see: precision.

Acronym
A word constructed from the beginning letters of each word in a phrase or a name or a title.

Active
1. In motion, or in use. In contrast to "passive."
2. An S-box whose input has changed.

Additive Combiner
An additive combiner uses numerical concepts similar to addition to mix multiple values into a single result. This is the basis for conventional stream ciphering. Also see extractor.

One example is byte addition modulo 256, which simply adds two byte values, each in the range 0..255, and produces the remainder after division by 256, again a value in the byte range of 0..255. The modulo is automatic in an addition of two bytes which produces a single byte result. Subtraction is also an "additive" combiner.

Another example is bit-level exclusive-OR which is addition mod 2. A byte-level exclusive-OR is a polynomial addition.

Additive combiners are linear, in contrast to nonlinear combiners such as:

Additive RNG
(Additive random number generator.) A LFSR-based RNG typically using multi-bit elements and integer addition (instead of XOR) combining. References include:
Knuth, D. 1981. The Art of Computer Programming, Vol. 2, Seminumerical Algorithms. 2nd ed. 26-31. Addison-Wesley: Reading, Massachusetts.
Marsaglia, G. and L. Tsay. 1985. Matrices and the Structure of Random Number Sequences. Linear Algebra and its Applications. 67:147-156.

Advantages include:

  • A long, mathematically proven cycle length.
  • Especially efficient software implementations.
  • Almost arbitrary initialization (some element must have its least significant bit set).
  • A simple design which is easy to get right.

In addition, a vast multiplicity of independent cycles has the potential of confusing even a quantum computer, should such a thing become realistic.

For Degree-n Primitive, and Bit Width w
   Total States:       2nw
   Non-Init States:    2n(w-1)
   Number of Cycles:   2(n-1)(w-1)
   Length Each Cycle:  (2n-1)2(w-1)
   Period of lsb:      2n-1

The binary addition of two bits with no carry input is just XOR, so the lsb of an Additive RNG has the usual maximal length period.

A degree-127 Additive RNG using 127 elements of 32 bits each has 24064 unique states. Of these, 23937 are disallowed by initialization (the lsb's are all "0") but this is just one unusable state out of 2127. There are still 23906 cycles which each have almost 2158 steps. (The Cloak2 stream cipher uses an Additive RNG with 9689 elements of 32 bits, and so has 2310048 unique states. These are mainly distributed among 2300328 different cycles with almost 29720 steps each.)

Like any other LFSR, and like any other RNG, and like any other FSM, an Additive RNG is very weak when standing alone. But when steps are taken to hide the sequence (such as using a jitterizer nonlinear filter and Dynamic Substitution combining), the resulting cipher can have significant strength.

Additive Stream Cipher
The conventional stream cipher, based on simple additive combining.

Adduction
In argumentation, the process of synthesizing a deep understanding of meaning, based on the analysis of multiple examples. Often used in the Socratic Method.

ad hoc
Something established for a particular one-time purpose.

AES
Advanced Encryption Standard. A 128-bit or 256-bit conventional block cipher replacement for DES. The new block cipher chosen by NIST for general use by the U.S. Government.

The mechanics of AES are widely available elsewhere. Here I note how one particular issue common to modern block ciphers is reflected in the realized AES design. That issue is the size of the implemented keyspace compared to the size of the potential keyspace for blocks of a given size.

A Block Cipher Model

A common academic model for conventional block ciphers is a "family of permutations." The "permutation" part of this means that every plaintext block value is found as ciphertext, but generally in a different position. The "family" part of this can mean every possible permutation. However, modern block ciphers key-select only an infinitesimal fraction of those possibilities.

Suppose we have a block which may take on any of n different values. How many ways can those n block values be rearranged as in a block cipher? Well, the first value can be placed in any of the n possible positions, but that fills one position so the second value has only n-1 positions available. Continuing on, the third has n-2 possibilities and so on for n different factors. Thus we find that the number of options is the same as the definition of factorial. The number of distinct permutations of n different values is n-factorial.

The Corresponding AES Model

A 128-bit key can select 2128 emulated tables. However, a 128-bit block has an alphabet of about 3.4x1038 different values, and so could have 3.4x1038 factorial emulated tables. That value is BIG, BIG, BIG, but still within range of my JavaScript page:

There we find that 3.4x1038 different values have on the order of 2(1040) distinct permutations. That value would take 1040 bits to represent, and can be directly compared to the 256 bits needed to represent the larger keys used in AES.

    A 128-bit block can be any one of 2128 or 3.4x1038 different values. To form a particular permutation, the first value can be placed in any of 3.4x1038 places, the second in 3.4x1038-1 places, and so on for 3.4x1038 different factors. As a ballpark calculation, we might expect 3.4x1038-factorial to be similar to (1038)1038. That would be the same as 2 to the power 1038 log2 1038, which is 2 to the power 1038 * 128, and that is about 2 to the power 1040, nicely confirming the JavaScript results.

AES Reality

For 128-bit blocks and 256-bit keys, AES provides:

  • 2256 keyed or emulated tables, out of about
  • 210,000,000,000,000,000,000,000,000,000,000,000,000,000 possibilities.

The obvious conclusion is that almost none of the keyspace implicit in the theoretical model of a conventional block cipher is actually implemented in AES, and that is consistent with other modern designs. Is that important? Apparently not, but nobody really knows. It does seem to imply that just a few known plaintext blocks should be sufficient to identify the correct key from a set of possibilities, which might make known plaintext more of an issue than normally claimed. Does it lead to a known break? No, or at least not yet. But having only a tiny set of keyed permutations should lead to questions about patterns and relationships within the selected set.

The real issue here is not the exposure of a particular weakness in AES, since no such weakness is shown. Instead, the issue is that conventional cryptographic wisdom does not force models to correspond to reality, and poor models lead to errors in reasoning. The distinction between theory and practice is pronounced in cryptography. For other examples of failure in the current cryptographic wisdom, see one time pad, BB&S, DES, and, of course, old wives' tale.

Is AES Enough for Government Secrets?

AES is said to be certified for SECRET and TOP SECRET classified material. That might have us believe that AES is trusted by NSA, but it may mean less than it seems.

No cipher, by itself, can guarantee security. Any cryptographic system will have to be certified by NSA before protecting classified information. In practice, cryptosystems will be provided by NSA to contractors, those systems may or may not use AES, and they may not use AES in the expected form. That does not imply that AES is bad, it just means that we cannot really know what NSA will allow, despite general claims.

Affine
Generally speaking, linear.

Technically, function f : G -> G of the form:

   f(x) = ax + b
with non-zero constant "b".

Affine Boolean Function
A Boolean function which can be represented in the form:
anxn + an-1xn-1 + ... + a1x1 + a0
where the operations are mod 2: addition is Exclusive-OR, and multiplication is AND.

Note that all of the variables xi are to the first power only, and each coefficient ai simply enables or disables its associated variable. The result is a single Boolean value, but the constant term a0 can produce either possible output polarity.

Here are all possible 3-variable affine Boolean functions (each of which may be inverted by complementing the constant term):

     affine    truth table

          c    0  0  0  0  0  0  0  0
         x0    0  1  0  1  0  1  0  1
      x1       0  0  1  1  0  0  1  1
      x1+x0    0  1  1  0  0  1  1  0
   x2          0  0  0  0  1  1  1  1
   x2+   x0    0  1  0  1  1  0  1  0
   x2+x1       0  0  1  1  1  1  0  0
   x2+x1+x0    0  1  1  0  1  0  0  1

See also: Boolean function nonlinearity.

Affine Cipher
One of the classic hand-ciphers, described mathematically as
    F(x) = ax + b (mod n)
where the non-zero term makes the equation affine.

Most of the classic hand-ciphers can be seen as simple substitution stream ciphers. Each plaintext letter selects an entry in the substitution table (for that cipher), and the contents of that entry becomes the ciphertext letter. The affine equation thus represents one way to set up the table, as a particular simple permutation of the letters in the table. (Of course, by using the equation we need no explicit table, but we also constrain ourselves to the simplicity of the equation.) To assure that we have a permutation, we require that a and n be relatively prime, that is, the gcd(a,n) = 1, or in number theory notation, just (a,n) = 1.

In modern terms, the strength of the classic substitution ciphers is essentially nil. In modern cryptanalysis, we generally assume that the opponent has a substantial amount of known plaintext. Since the table does not change, every known-plaintext character has the potential to fill in another entry in the table. Very soon the table is almost completely exposed, which ends all strength. These simple substitution ciphers with small, fixed tables (or even just equations for such tables) are also extremely vulnerable to attacks using ciphertext only.

Algebra
The use of variables and the valid manipulation of expressions in the study of numbers.

Algebraic Normal Form
(ANF). Typically, the symbolic representation of a mapping in the usual sum-of-products form.
  • For Boolean functions in symbolic form, each term is an input variable combination for which the output is '1'.
  • For Boolean functions in explicit form, basically a truth table: simply a list of the output value as it will occur when stepping through all possible input variable combinations, one-by-one. This is just the bit sequence of the output value as it would occur in input-variable order.

Algebra of Secrecy Systems
An oft-overlooked proposal by Shannon, describing both the construction of a multiple encryption cipher (called the "product") and the keyed selection of one from among many ciphers (called the "weighted sum").

"The first combining operation is called the product operation and corresponds to enciphering the message with the first secrecy system R and enciphering the resulting cryptogram with the second system S, the keys for R and S being chosen independently."

"The second combining operation is 'weighted addition.'

   S = pR + qS  p + q = 1.
It corresponds to making a preliminary choice as to whether system R or S is to be used with probabilities p and q, respectively. When this is done or R or S is used as originally defined." [p.658]

More specifically (and with a change of notation):

"If we have two secrecy systems T and R we can often combine them in various ways to form a new secrecy system S. If T and R have the same domain (message space) we may form a kind of 'weighted sum,'

   S = pT + qR
where p + q = 1. This operation consists of first making a preliminary choice with probabilities p and q determining which of T and R is used. This choice is part of the key of S. After this is determined T or R is used as originally defined. The total key of S must specify which of T and R is used, and which key of T (or R) is used."

"More generally we can form the sum of a number of systems.

   S = p1T + p2R + . . . +  pmU    Sum( pi ) = 1
We note that any system T can be written as a sum of fixed operations
   T = p1T1 + p2T2 + . . . +  pmTm
Ti being a definite enciphering operation of T corresponding to key choice i, which has probability p."

"A second way of combining two secrecy systems is taking the 'product,' . . . . Suppose T and R are two systems and the domain (language space) of R can be identified with the range (cryptogram space) of T. Then we can apply first T to our language and then R to the result of this enciphering process. This gives a resultant operation S which we write as a product

   S = RT
The key for S consists of both keys of T and R which are assumed chosen according to their original probabilities and independently. Thus if the m keys of T are chosen with probabilities
   p1p2 . . . pm
and the n keys of R have probabilities
   p'1p'2 . . . p'n ,
then S has at most mn keys with probabilities pi p'j. In many cases some of the product transformations Ri Tj will be the same and can be grouped together, adding their probabilities.

"Product encipherment is often used; for example, one follows a substitution by a transposition or a transposition by a Vigenere, or applies a code to the text and enciphers the result by substitution, transposition, fractionation, etc."

"It should be emphasized that these combining operations of addition and multiplication apply to secrecy systems as a whole. The product of two systems TR should not be confused with the product of the transformations in secrecy systems Ti Rj . . . ."

-- Shannon, C. E. 1949. Communication Theory of Secrecy Systems. Bell System Technical Journal. 28:656-715.

It is easy to dismiss this as being of historical interest only, but there are advantages here which are well beyond our current usage.

For the keyed selection among ciphers, there would be some sort of simple protocol (i.e., not cryptographic per se), for communicating cipher selections to the deciphering end. (Perhaps there would be some sort of simple handshake for email use.) The result would be to have (potentially) a new selection from a set of ciphers on a message-by-message basis.

  • Having frequent cipher changes guarantees that we can change ciphers, immediately and easily, if any cipher we use is found weak.
  • A cipher change terminates any existing break of a particular cipher which has been exposing our information. Since we cannot expect to know when a break exists, changing to a different cipher can minimize the effect of a cipher fault even though we know nothing about that fault.
  • Using different ciphers at different times prevents information from being concentrated under a single cipher. This prevents the opposing attack budget from concentrating on one target.
  • The ability to easily change ciphers supports the continued creation and use of new ciphers, which the opponents must then identify, obtain, analyze and break. Although single new cipher design costs can be distributed among users simply by selling product, each opponent must bear the full cost of analysis, since most attackers cannot cooperate. And as the set of ciphers continues to grow, the opponents may never catch up to the complete set of ciphers actually in use.
  • Cipher selection has minimal execution cost.

With respect to multiple encryption or ciphering "stacks" (as in "protocol stacks"), there are various security advantages:

  • A cipher stack prevents a single broken cipher from exposing our information. Since any particular cipher may be broken and we will not know (the opponents do not tell us), this protects against a dangerous single point of failure.
  • A three-cipher stack hides the known-plaintext (and "defined-plaintext") information for each individual cipher. Such information simply is not exposed to the opponents, which thus prevents known-plaintext attacks (and defined-plaintext attacks) on the individual ciphers. The construction thus eliminates whole classes of attack on the component ciphers.
  • A three-cipher stack gives us exponentially many different ciphering stack possibilities. The intent here is not to add keyspace, since reasonable ciphers already have enough keyspace. Instead, the point is the easy construction of many conceptually different overall ciphering functions which the opponents must engage.
  • Users who are "Nervous Nellies" could specify that their particular favorite cipher would always be part of the (changing) cipher stack, thus "guaranteeing" at least as much strength as using that cipher alone. (If the adjacent cipher was the same, in decipher mode, using the same key, then there would be no strength. So do not do that. If an arbitrary cipher was likely to reduce strength, that would be an attack, and we see no such attack.)
  • A three-layer cipher stack obviously has an execution cost of three layers of ciphering.

Also see: Perfect Secrecy and Ideal Secrecy.

Algorithm
The description of a sequence of operations which does something. Typically,
  • a finite procedure,
  • composed of discrete steps,
  • expressed in a fixed instruction vocabulary.
Also see heuristic, Structured Programming and software patent.

An algorithm intended to execute reliably as a computer program necessarily must handle, or in some way at least deal with, absolutely every error condition which can possibly occur in operation. (We do assume functional hardware, and thus avoid programming around the possibility of actual hardware faults, such as memory or CPU failure.) These "error conditions" normally include Operating System errors (e.g., bad parameters passed to an OS operation, resource not available, various I/O failures, etc.), and arithmetic issues (e.g., division by zero, overflow, etc.) which may halt execution when they occur.

Other possibilities include errors the OS will not know about, including the misuse of programmer-defined data structures, such as buffer overrun.

A practical algorithm must recognize various things which validly may occur, even if such things are exceedingly rare. One example might be in assuming that two floating-point variables which represent the same value will be equal. Another example might be to assume that a floating-point variable will "never" have some particular value (which might lead to a divide-by-zero fault). Yet another example would be to assume that an arbitrary selection of x will lead to a sufficiently long cycle in BB&S, even if the alternative is very, very unlikely.

Algorithmic Complexity
Kolmogorov-Chaitin complexity.

Alias File
My term for a cipher system computer file relating names to keys. This allows ordinary users to specify which key is to be used by using the far-end name, without knowing the actual key itself. Thus, the actual key can be long and random and can change over time and the user need not coordinate these changes.

In particular, my Cloak2 and Penknife ciphers implemented encrypted alias files of text lines of arbitrary length, each of which included name, start date, and key. New keys were made available only as secure ciphertext, but the alias files were arranged so they could consist of multiple ciphertext files simply concatenated as ciphertext. Thus, new keys could be added to the start of the alias file just using a simple and secure file copy operation. When searching for a particular alias, the date was also checked, and that key used only when the correct date had arrived. This allowed an entire office of users to change to a new key automatically, at the same time, without even knowing they were using a different key. Appropriate functions allowed access to old keys so that email traffic could be archived in ciphertext form.

Obviously, an alias file must be encrypted. The single key or keyphrase decrypting an alias file thus provides access to all the keys in the file. But each alias file contains only a subset of the keys in use within an organization, and even those are only valid over a subset of time. An organization security officer could archive old alias files, strip out the old keys and add new ones, then encipher the new alias file under a new pass phrase. In this way, the contents of old encrypted email would not be hidden from the authorizing organization. Alias file maintenance could be either as complex or as simple as one might like.

See, for example,

describing the Cloak2 cipher.

Allan Variance
A descriptive variance statistic based on deviation from preceding sample. This is computed as the sum of the squares of differences between each sample and the previous sample, divided by 2, and divided by the number of samples-1.

Allan Variance is useful in analysis of residual noise in precision frequency measurement. Five different types of noise are defined: white noise phase modulation, flicker noise phase modulation, white noise frequency modulation, flicker noise frequency modulation, and random walk frequency modulation. A log-log plot of Allan variance versus sample period produces approximate straight line values of different slopes in four of the five possible cases. A different (more complex) form called "modified Allan deviation" can distinguish between the remaining two cases.

Also see

All or Nothing Transform
(AONT). Basically the idea of a block mixing function in which knowing even all but one of the mixed outputs exposes none of the original input block values. As defined by Rivest in 1997:
"Definition. A transformation f mapping a message sequence m1,m2,...,ms into a pseudo-message sequence m1',m2',...,ms' is said to be an all-or-nothing transform if:
  • The transform f is reversible: given the pseudo-message sequence, one can obtain the original message sequence.
  • Both the transformation f and its inverse are efficiently computable (that is, computable in polynomial time).
  • It is computationally infeasible to compute any function of any message block if any one of the pseudo-message blocks is unknown."
-- Rivest, R. 1997. All or nothing encryption and the package transform. Fast Software Encryption 1997. 210-218.

When used with a conventional block cipher, an AONT appears to increase the cost of a brute-force attack by a factor which is the number of blocks in the message. Rivest also notes that the large effective block size can avoid ciphertext expanding chaining modes by using ECB mode on the large block. Also see huge block cipher advantages.

The Balanced Block Mixing (BBM) which I introduced to cryptography in my article: "Keyed Balanced Size-Preserving Block Mixing Transforms" ( locally, or @: http://www.ciphersbyritter.com/NEWS/94031301.HTM) in early 1994 (three years before the Rivest publication), and then developed in a series of subsequent articles, apparently can be an especially fine example of an all-or-nothing transform.

Alphabet
In cryptography, the set of symbols under discussion. Also see universe, population and cardinal.

Alternative Hypothesis
In statistics, the statement formulated to be logically contrary to the null hypothesis. The alternative hypothesis H1 includes every possible result other than the specific outcome specified in the null hypothesis.

The alternative hypothesis H1 is also called the research hypothesis, and is logically identical to "NOT-H0" or "H0 is not true."

Amplifier
a component or device intended to sense a signal and produce a larger version of that signal. In general, any amplifying device is limited by available power, frequency response, and device maximums for voltage, current, and power dissipation. Also see: voltage divider.

Transistors are analog amplifiers which are basically linear over a reasonable range and so require DC power. In contrast, relays are classically mechanical devices with direct metal-to-metal moving connections, and so can handle generally higher power and AC current. The classic analog amplifier is an operational amplifier.

Unexpected oscillation can be indicated by:

  • An unusually hot active device as felt by a finger.
  • Unexpectedly high current flow as shown by a multimeter.
  • Unexpected sounds as heard from a speaker monitoring the unit under test.
  • Unexpected output signal as seen on an oscilloscope.
  • Unexpected variation when touching various pins, as shown on a multimeter measuring the output signal, the output DC level or power supply current.
  • Unexpected signal or variation as shown by an RF voltage probe connected to multimeter.
  • Unexpected signal or variation as seen on a wideband AC voltmeter.

Oscillation occurs when:

  • an amplified signal finds its way back to the amp input; AND
  • the gain through the amplifier and feedback exceeds 1.0; AND
  • the total phase shift around the feedback loop is 360 degrees.

To stop undesired oscillation:

  • Increase isolation between input and output; OR
  • Decrease gain; OR
  • Change phase.

To Increase Isolation

  • Bypass the amplifier power pins. All current from the output pin originally comes through a power pin. Signal at the output is necessarily reflected in signal at the power pins. Unless power lines are bypassed with significant storage capacitance, signal on the output will feed back to the input. Serious bypassing should be a part of normal use. The negative supply often needs to be cleaner than the positive supply.
  • Decouple the amplifier power pins. Add series resistors to the power supply to form low-pass filters (in combination with the bypass capacitors), and thus decrease high-frequency feedback between stages via the supply.
  • Try moving input and output leads as far apart as possible. If that improves the situation, the feedback path has at least been identified.
  • Try preventing capacitive coupling between input and output. Interpose a conductive shield (try a finger) between in and out. If that helps, consider shielding the input and output lines. Or put in a permanent metallic shield like a piece of copper sheet or PC board material.
  • For units with single-ended input and output signals and high overall gain, try breaking the ground loop. Try isolation transformers on input and/or output signal lines. Any stage with 40dB or more of resulting gain can be a particular problem when output signal returns through the same ground used by the input signal.
  • Consider redesigning for balanced input. Balanced signal lines help prevent signal-line magnetic coupling and feedback.

To decrease gain:

  • Increase high frequency negative feedback. Try using a small capacitor across the feedback resistor to reduce overall gain starting at a high frequency. (Make capacitive reactance equal to feedback resistance perhaps an octave above the highest desired frequency.)
  • High-pass filter the input. A small series resistor and shunt capacitor form an RC filter that can take effect above the highest frequency of interest. Differential inputs each with their own input resistor can use a single shunt capacitor across the input pins.
  • Redesign for reduced stage gain. Use another stage if more gain is necessary.

To change phase:

  • Identify the frequency-determining path. Sometimes, touching various connections with fingers or a grounded capacitor will affect oscillation. If so, add components or change values to force a frequency which has insufficient gain to support oscillation.
  • For capacitive loads, try a small series output resistor. Capacitive loads as small as 50pF, or even just cable capacitance, are notorious for causing operational amplifier problems. A small resistor on the output (e.g., the 50 or 75 ohm characteristic impedance of driven coax) before the cable can settle things down considerably.
  • Try damping transformer resonance with a load resistor. Sometimes transformers can resonate with circuit capacitance.
  • Use a "snubber" series RC across resonant LC tanks. For resonances at least 100x the highest desired frequency, Ridley Engineering recommends a resistor equal to the inductive (or capacitive, since it is resonant) reactance at the resonant frequency (R = XL = 2 Pi f L), with a series capacitor of C = 1 / (Pi f R). An analysis from Hagerman Technology suggests a resistor of R = SQRT( L / C) with a capacitor equal to 1 / (R f), which is about the same.

Amplitude
The signal level or magnitude, referenced to zero. For balanced bipolar signals, half of the peak-to-peak value; half of: the positive peak value less the negative peak value.

Anagram
To form a recognizable word, phrase or message by rearranging letters. This is the permutation of letters to achieve meaning. Also see: multiple anagramming.

Analog
Pertaining to continuous values. As opposed to digital or discrete quantities.

Analogy
From Greek geometry, "according to a ratio." To "draw a parallel" between situations which seem to have some property in common. A mode of argumentation in which a known relationship or pattern is applied to a new situation. A form of inductive reasoning which supports the creation of new testable consequences.

When two things are related by appropriate similarity in structure or function, we can infer that what is known about one thing also may apply to the other. Such an inference may or may not be true, but it can be examined and tested.

Analysis
1. The decomposition of a whole into component parts. As opposed to synthesis. Also see: hypothesis, argumentation, scientific method, proof. Also see: tree analysis, fault tree analysis and risk analysis. Also see: cryptanalysis.
2. The branch of mathematics concerned with functions and limits. The calculus.

AND
A Boolean logic function which is also mod 2 multiplication.

ANF
Algebraic Normal Form.

Anode
In an electronic circuit element, the end which sources electrons and accepts conventional current flow. The (-) end of a charged battery. The (+) end of a battery being charged. The P side of a PN semiconductor junction. As opposed to cathode.

Antecedent
In the study of logic, the if-clause of an if-then statement. Also see: consequent

AONT
All or Nothing Transform.

Arc
In an FSM, a path segment which does not repeat. As opposed to a cycle. In any FSM of finite size, every arc must eventually lead to a cycle.

Argument
1. In mathematics and programming, a variable which is an input to a function. A parameter. An independent variable.
2. In the study of logic, reasons that support a conclusion. Technically, a sequence of statements (premises) followed by a conclusion statement. An argument is valid only if the conclusion necessarily follows from the premises, and may be invalid:
  • In material content, from misstatement of fact
  • In language wording, from misuse of terms
  • In formal structure, from misuse of logic.
Also see: fallacy, argumentation and burden of proof.

Argumentation
Argument applied to a case under discussion. Reasoned discourse to infer a conclusion. There are various classes:
  • Analytic: separation into component parts; formal logic
  • Synthetic: construction of a whole from component parts
  • Dialectic: discussion and debate by questions and answers; the use of conflicting ideas to explore the issue under discussion
  • Rhetoric: techniques of argument presentation; the art of persuasive speech
Also see: claim, extraordinary claims, fallacies, Socratic Method, sophistry, spin and lie.

In On Sophistical Refutations (350 B.C.E.), Aristotle (384-322 B.C.E.) lists five goals for countering arguments:

  1. Refutation -- to find fault in the argument or claim itself
  2. Fallacy -- to find fault in the logic of the argument
  3. Paradox -- to show the argument leads to logical contradiction
  4. Solecism -- to cause emotional outburst
  5. Babbling -- to cause repetition, lengthy explanation, or confusing distinctions

Refutation can occur in various ways. Disputing the evidence being used to support a claim can be considered a new claim and different evidence presented. However, disputing the reasoning itself requires only logic and typically no further evidence at all. See extraordinary claims.

Like cryptography, argumentation is war, and tricks abound when winning is the ultimate goal. But arguing to win is fundamentally unscientific, since learning occurs mainly when an error is found and recognized.

The first requirement of successful argumentation is to have a stated topic or thesis. Without a stated topic, an unscrupulous opponent can lead the argument to some apparently similar but more vulnerable issue, and few in the audience will notice. That is especially true when a topic is introduced casually, and then changed by the opponent in the very first response. Another approach for the opponent is to indignantly bring up and discuss in detail some supposed error on an irrelevant but apparently related topic. A clever topic change also may cause awkward repetition and babbling in the attempt to expose the change and reverse it. The correct response is to be aware enough to recognize the topic change immediately, and return to the original topic; to argue that the comments are off-topic is to introduce a new topic.

There is no way to make an opponent stay on-topic, and if they know they will lose on-topic, that actually may be impossible. Moreover, the opponent may pose various questions (on some new topic), and claim you are not being responsive, the discussion of that claim itself being a new topic. But if you want to take your topic to conclusion, you cannot follow an opponent who wants anything but that. (Also see spin.)

The second requirement of successful argumentation is to force the discussion to remain on the material content. If the original argument might be successful, an unscrupulous opponent may seek to divert the discussion to the appropriateness of, or bias in, the symbols or names used for the concept. Or the opponent may find and protest premises stated without mathematical precision. But a conventional argument need be neither mathematically complete nor mathematically precise to be valid. (This is the fallacy of accident.) The correct response is to point out that the comments are irrelevant and return to the material issue; to argue that the comments are wrong is to argue a changed topic.

How to Win

The goal of scientific argumentation is to improve knowledge and insight, not to anoint a "superior" contestant. Sadly, those willing to "win" with dishonesty generally do find an easily mislead audience.

Almost all on-line arguments are technically informal in the sense of depending upon context and definitions. The need for particular context generally leaves ample room to confuse the issue, even for someone who knows almost nothing about the topic.

If the proposed argument is basically unsound, that case can be won on its merits.

How to Attack

If the proposed argument is basically sound, but based on analogy, we need to realize that there are few really good analogies. Examine the analogy in detail and try various cases until one is found that is good in the analogy but bad in the proposed argument.

If the proposed argument is basically sound, one can win anyway by changing the topic and doing so in a smooth way the audience will not notice.

  1. Look for an invalid context. Generally an argument is valid only within a particular context. Find a different context where the argument does not work and "disprove" it on that basis. Or,
  2. Look for a source of confusion. Usually the argument will depend upon some particular words, either explicit or implied, that have multiple reasonable definitions. Choose one or more of those, and show how the argument fails with one of the other definitions. Typically, the opponent will reply and explain the proper context, which, of course, you already know.
  3. Look for an error. In the process of describing the context, the opponent generally will increase the number of word dependencies, which is just more material to exploit. However, if the explanation can itself be interpreted as error, you can expose that error and then claim the opponent is not only wrong, but demonstrably incompetent.

How to Defend

Most responses carry a least a thin patina of respectability. However, many times a response is actually just the first sad shot in a verbal combat that seeks defeat and winning by deception. Unfortunately, it may be difficult to distinguish between mere ignorance and actual attack. It is thus important to actually examine the logic of any response.

Arity
In mathematics and programming, the number of parameters or arguments to a function or operator. Also see unary, binary, monadic and dyadic.

ASCII
A public code for converting between 7-bit values 0..127 (or 00..7f hex) and text characters. ASCII is an acronym for American Standard Code for Information Interchange. Also see Base-64.
DEC HEX CTRL CMD    DEC HEX CHAR    DEC HEX CHAR   DEC HEX CHAR

  0  00  ^@  NUL     32  20  SPC     64  40  @      96  60  '
  1  01  ^A  SOH     33  21   !      65  41  A      97  61  a
  2  02  ^B  STX     34  22   "      66  42  B      98  62  b
  3  03  ^C  ETX     35  23   #      67  43  C      99  63  c
  4  04  ^D  EOT     36  24   $      68  44  D     100  64  d
  5  05  ^E  ENQ     37  25   %      69  45  E     101  65  e
  6  06  ^F  ACK     38  26   &      70  46  F     102  66  f
  7  07  ^G  BEL     39  27   '      71  47  G     103  67  g
  8  08  ^H  BS      40  28   (      72  48  H     104  68  h
  9  09  ^I  HT      41  29   )      73  49  I     105  69  i
 10  0a  ^J  LF      42  2a   *      74  4a  J     106  6a  j
 11  0b  ^K  VT      43  2b   +      75  4b  K     107  6b  k
 12  0c  ^L  FF      44  2c   ,      76  4c  L     108  6c  l
 13  0d  ^M  CR      45  2d   -      77  4d  M     109  6d  m
 14  0e  ^N  SO      46  2e   .      78  4e  N     110  6e  n
 15  0f  ^O  SI      47  2f   /      79  4f  O     111  6f  o
 16  10  ^P  DLE     48  30   0      80  50  P     112  70  p
 17  11  ^Q  DC1     49  31   1      81  51  Q     113  71  q
 18  12  ^R  DC2     50  32   2      82  52  R     114  72  r
 19  13  ^S  DC3     51  33   3      83  53  S     115  73  s
 20  14  ^T  DC4     52  34   4      84  54  T     116  74  t
 21  15  ^U  NAK     53  35   5      85  55  U     117  75  u
 22  16  ^V  SYN     54  36   6      86  56  V     118  76  v
 23  17  ^W  ETB     55  37   7      87  57  W     119  77  w
 24  18  ^X  CAN     56  38   8      88  58  X     120  78  x
 25  19  ^Y  EM      57  39   9      89  59  Y     121  79  y
 26  1a  ^Z  SUB     58  3a   :      90  5a  Z     122  7a  z
 27  1b  ^[  ESC     59  3b   ;      91  5b  [     123  7b  {
 28  1c  ^\  FS      60  3c   <      92  5c  \     124  7c  |
 29  1d  ^]  GS      61  3d   =      93  5d  ]     125  7d  }
 30  1e  ^^  RS      62  3e   >      94  5e  ^     126  7e
 31  1f  ^_  US      63  3f   ?      95  5f  _     127  7f  DEL

Associative
1. In abstract algebra, a dyadic operation in which two sequential operations on three arguments can first operate on either the first two or the last two arguments, producing the same result in either case: (a + b) + c = a + (b + c).
2. In algebra, the associative law for addition and multiplication. The algebraic law for evaluating the result of grouping terms or factors in different ways, as in conventional arithmetic:
a + (b + c) = (a + b) + c
a * (b * c) = (a * b) * c

Also see: commutative and distributive.

Assumption
In the study of logic, a statement accepted as true, but not proven. Typically a condition or premise.

In a mathematical proof, each and every assumption must be true for the proof result to be true. If the truth of any assumption is unknown, the proof is formally incomplete and the result has no meaning.

In practice, proofs have meaning only to the extent that each and every required assumption can be assured, including assumptions which may not be immediately apparent. In practical cryptography, while some assumptions possibly could be assured by the user, others could only be assured by the cipher designer, who must then be trusted, along with his company, the entire distribution path and so on. Even worse, still other assumptions may be impossible to assure in practice by any means at all, which makes any such proof useless for practical cryptography.

Asymmetric Cipher
A public key cipher.

Asynchronous
Not synchronous; not aligned in time. Digital logic systems in which signal changes are not related to an overall clock.

Also RS-232 and similar "serial port" signals, in which byte or character values are transferred bit-by-bit in bit-serial format. Since digital signals require both proper logic levels and proper timing to sense those levels, timing is established by the leading edge of a "start bit" sent at the start of each data byte. See asynchronous transmission.

Asynchronous Stream Cipher
A self-synchronizing stream cipher.

Asynchronous Transmission
The common RS-232 "serial port" signal, where data are sent on a single wire. This is complicated because a bit is not just a logic level, but also a time to sample that level. The necessary timing is established by the edge between a "high" stop bit or resting level, and a "low" start bit.

Transmit: The line rests "high." When a character is to be sent, a start bit or "low" level is sent for one bit-time. Then each data bit is sent, for one bit-time each, as are one or two stop or "high" level bit-times. Then, if no more data are ready for sending, the line just rests "high."

Receive: The line is normally "high." The instant the line goes "low" is the beginning of a start bit, and that establishes an origin for bit timing. Exactly 1.5 bit-times later, hopefully in the middle of the first data-bit time, the line level is sampled to record the first incoming bit. The second bit is recorded one bit-time later, and so on. When all bits have been recorded, the receiver sends the resulting character, all bits simultaneously, to a local register or FIFO queue for pickup. Note that all this implies that we know the format of the character with respect to bit time and number of bits.

Timing Accuracy: Everything depends upon both transmit and receive ends having approximately the same bit timing. The leading edge of the start bit temporarily synchronizes the receiver, even though the transmit and receive clock rates may be somewhat different. With 8-bit characters, the last data bit is sampled exactly 8.5 bit-times from the detected leading edge of the start bit. If the receive timing varies as much as +/- 0.5 bit in 8.5, the last bit will be sampled outside the correct bit time. So the total timing accuracy must be within +/- 5.8 percent, for all sources transmit and receive clock variation, including sampling delay in detecting the start bit. Nowadays this is easily achieved with cheap crystal oscillator clock modules and digital count logic.

Attack
General ways in which a cryptanalyst may try to "break" or penetrate the secrecy of a cipher. These are not algorithms; they are just approaches as a starting place for constructing specific algorithms.

In normal cryptanalysis we start out knowing plaintext, ciphertext, and cipher construction. The only thing left unknown is the key. A practical attack must recover the key. (Or perhaps we just know the ciphertext and the cipher, in which case a practical attack would recover plaintext.) Simply finding a distinguisher (showing that the cipher differs from the chosen model) is not, in itself, an attack. If an attack does not recover the key (or perhaps the particular key-selected internal state used in ciphering), it is not a real attack.

In cryptography, when someone says they have "an attack," the implication is that they have a successful attack (a break) and not just another failed attempt. It is obviously much easier to simply claim to have an attack than to actually analyze, innovate, build and test a working attack, which makes it necessary to back up such claims with evidence. Arrogant claims, with "proof left as an exercise for the student" or "read the literature" responses, deserve jeers instead of the cowed respect they often get.

A claim to have an attack can be justified by:

  1. describing the process in such detail that others can understand it and could use it to break the cipher,
  2. actually performing the attack in practice and showing the claimed results (e.g., finding the unknown key, given known plaintext), or
  3. demonstrating the ability to do something fundamental which should be impossible (like finding several strings which each have the same cryptographic hash result).
Simply finding a distinguisher is not in itself sufficient to expose keying or plaintext as required of a real attack.

It is not sufficient to say: "My interpretation of the theory is that there must be a break, so the cipher is broken"; it is instead necessary to actually devise a process which recovers key or plaintext. Furthermore, there are many attacks which work against scaled-down tiny ciphers, but which do not scale up as valid attacks against the original large cipher: Just because we can solve newspaper-amusement ciphers (tiny versions of conventional block ciphers) does not imply that any real-size block ciphers are "broken." The process used to solve newspaper ciphers is not "an attack" on block ciphers in general.

Classically, attacks were neither named nor classified; there was just: "here is a cipher, and here is 'the' attack." (Many different attacks may be possible, but even one practical attack is sufficient to cause us to avoid that cipher.) And while this gradually developed into named attacks, there is no overall attack taxonomy. Currently, attacks are often classified by the information available to the attacker or constraints on the attack, and then by strategies which use the available information. Not only ciphers, but also cryptographic hash functions can be attacked, generally with very different strategies.

Information Constraints

We are to attack a cipher which enciphers plaintext into ciphertext or deciphers the opposite way, under control of a key. The available information necessarily constrains our attack strategies.

  • Ciphertext Only: We have only ciphertext to work with. Sometimes the statistics of the ciphertext provide insight and can lead to a break.
  • Known Plaintext: We have some, or even an extremely large amount, of plaintext and the associated ciphertext.
  • Defined Plaintext: We can submit arbitrary messages to be ciphered and capture the resulting ciphertext. (Also Chosen Plaintext and Adaptive Chosen Plaintext.) (A subset of Known Plaintext.)
  • Defined Ciphertext: We can submit arbitrary messages to be deciphered and see the resulting plaintext. (Also Chosen Ciphertext and Adaptive Chosen Ciphertext.)
  • Chosen Key: We can specify a change in any particular key bit, or some other relationship between keys.
  • Timing: We can measure the duration of ciphering operations and use that to reveal the key or data.
  • Fault Analysis: We can induce random faults into the ciphering machinery, and use those to expose the key.
  • Man-in-the-Middle: We can subvert the routing capabilities of a computer network, and pose as the other side to each of the communicators. (This is a key authentication attack on public key systems.)

Attack Strategies

The goal of an attack is to reveal some unknown plaintext, or the key (which will reveal the plaintext). An attack which succeeds with less effort than a brute-force search we call a break. An "academic" ("theoretical," "certificational") break may involve impractically large amounts of data or resources, yet still be called a "break" if the attack would be easier than brute force. (It is thus possible for a "broken" cipher to be much stronger than a cipher with a short key.) Sometimes the attack strategy is thought to be obvious, given a particular informational constraint, and is not further classified.

  • Brute Force (also Exhaustive Key Search): Try to decipher ciphertext under every possible key until readable messages are produced. (Also "brute force" any searchable-size part of a cipher.)
  • Codebook (the classic "codebreaking" approach): Collect a codebook of transformations between plaintext and ciphertext.
  • Differential Cryptanalysis: Find a statistical correlation between key values and cipher transformations (typically the Exclusive-OR of text pairs), then use sufficient defined plaintext to develop the key.
  • Linear Cryptanalysis: Find a linear approximation to the keyed S-boxes in a cipher, and use that to reveal the key.
  • Meet-in-the-Middle: Given a two-level multiple encryption, search for the keys by collecting every possible result for enciphering a known plaintext under the first cipher, and deciphering the known ciphertext under the second cipher; then find the match.
  • Key Schedule: Choose keys which produce known effects in different rounds.
  • Birthday (usually a hash attack): Use the birthday paradox, the idea that it is much easier to find two values which match than it is to find a match to some particular value.
  • Formal Coding (also Algebraic): From the cipher design, develop equations for the key in terms of known plaintext, then solve those equations.
  • Correlation: In a stream cipher, distinguish between data and confusion, or between different confusion streams, from a statistical imbalance in a combiner.
  • Dictionary: Form a list of the most-likely keys, then try those keys one-by-one (a way to improve brute force).
  • Replay: Record and save some ciphertext blocks or messages (especially if the content is known), then re-send those blocks when useful.

Many attacks try to isolate unknown small components or aspects so they can be solved separately, a process known as divide and conquer. Also see: security.

Attack Tree
Supposedly, a formal way to analyze security in a cipher system. A method of cryptanalysis. Based on a threat model, and related to classic risk analysis as a specific kind of tree analysis.

A network in the form of a tree is used, with goals represented as nodes. Various possible ways to achieve a particular goal are represented as branches, which then can be taken as goals with their own branch nodes.

In cryptographic analysis, the idea is that the root node will represent the ultimate security we seek. Each path to the root then represents the accumulated effort needed to break that security. The problem is that it is typically impossible to assure that every alternative attack has been considered. And if some unconsidered approach is cheaper than any other, that becomes the true limit on security, despite not being present in the analysis.

Attack tree analysis does not tend to expose unconsidered attacks. Yet those are exactly the issues which carry the greatest cryptographic risk, because we can at least generally quantify the risk from known attacks. Since an attack tree cannot do what most needs to be done, it would seem to be a strange choice for cryptographic risk analysis. One could even argue that an attack tree is most useful as a formal aid in deluding naive executives and users.

Threat models basically concern what is to be protected, from whom, and for how long. But with ciphers, we seek to protect all our data, from everyone, forever. The extreme nature of these expectations is only part of what makes a conventional threat model unhelpful in understanding ciphering risks.

Cipher failure and exploitation happens in secret, so we cannot know how often it occurs and cannot develop a probability for it. Absent a probability of cipher failure, any attempt to understand ciphering risk is necessarily limited.

A more effective approach to system security is to build with understandable components. In component design, we can define exactly what each component permits. In component analysis, we can consider the security effects and expose the precise range of things each component allows. If none of the allowed things can cause a security problem, we will have no security problems. Components essentially become a custom language of system design which has no way of expressing security faults.

A component-based security design is far more restrictive and so is far more demanding than the conventional mode of hacking through a design and implementation. However, this design process provides a road map for real security, as opposed to belief in results from flawed analytical tools (like attack trees) and ad hoc analysis that simply cannot deliver the assurances we need.

The ability to analyze security must be designed into a system; it cannot be just added on to finished systems.

Augmented Repetitions
When sampling with replacement, eventually we again find some object or value which has been found before. We call such an occurrence a "repetition." A value found exactly twice is a double, or "2-rep"; a value found three times is a triple or "3-rep," and so on.

For a known population, the number of repetitions expected at each level has long been understood to be a binomial expression. But if we are sampling in an attempt to establish the effective size of an unknown population, we have two problems:

  1. The binomial equations which predict expected repetitions do not reverse well to predict population, and
  2. Exact repetitions discard information and so are less accurate than we would like. For example, if we have a double and then find another of that value, we now have a triple, and one less double. So if we are using doubles to predict population, the occurrence of a triple influences the predicted population in exactly the wrong direction.

Fortunately, there is an unexpected and apparently previously unknown combinatoric relationship between the population and the number of combinations of occurrences of repeated values. This allows us to convert any number of triples and higher n-reps to the number of 2-reps which have the same probability. So if we have a double, and then get another of the same value, we have a triple, which we can convert into three 2-reps. The total number of 2-reps from all repetitions (the augmented 2-reps value) is then used to predict population.

We can relate the number of samples s to the population N through the expected number of augmented doubles Ead:

     Ead(N,s) = s(s-1) / 2N .
This equation is exact, provided we interpret all the exact n-reps in terms of 2-reps. For example, a triple is interpreted as three doubles; the augmentation from 3-reps to 2-reps is (3 C 2) or 3. The augmented result is the sum of the contributions from all higher repetition levels:
           n    i
     ad = SUM  ( ) r[i] .
          i=2   2
where ad is the number of augmented doubles, and r[i] is the exact repetition count at the i-th level.

And this leads to an equation for predicting population:

     Nad(s,ad) = s(s-1) / 2 ad .
This predicts the population Nad as based on a mean value of augmented doubles ad. (For an example and comparison to various other methods, see the conversation: Clearly, we expect the number of samples to be far larger than the number of augmented doubles, but an error in the augmented doubles ad should produce a proportionally similar error in the predicted population Nad. We typically develop ad to high precision by averaging the results of many large trials.

However, since the trials should have approximately a simple Poisson distribution (which has only a single parameter), we could be a bit more clever and fit the results to the expected distribution, thus perhaps developing a bit more accuracy. Also see population estimation, birthday attack, birthday paradox and entropy.

Also see

Authentication
One of the objectives of cryptography: Typically, assurance that a message was sent by the purported author. This is message authentication and is sometimes assumed into message integrity.

It is possible to authenticate individual blocks, provided they are large enough to minimize the impact of adding extra authentication data in each block (see block code). One advantage lies in avoiding the alternative of buffering an entire message before it can be authenticated. That can be especially important for real-time (e.g., voice) communications.

Other forms of cryptographic authentication include key authentication for public keys, and source or user authentication, for the authorization to send the message in the first place.

Also see certification and certification authority.

Authenticating Block Cipher
A block cipher mechanism which inherently contains an authentication value or field. (Also see homophonic substitution.)

Authority
1. The right to demand acceptance.
2. A recognized, appointed, or certified expert responsible for statements or conclusions.

Science does not recognize mere authority as sufficient basis for a conclusion, but instead requires that facts and reasoning be exposed for review. The simple use of a name does not automatically create an ad verecundiam fallacy ("Appeal to Awe"). A name can identify a body of work giving the needed facts and the reasoning supporting a scientific conclusion.

Authority tends to hide the basis for drawing conclusions. Authority tends to avoid addressing complaints of false reasoning. Authority tends to hide reasoning and insists that a statement is correct simply because of who made it. A person repeating a conclusion from an authority often has no idea of the reasoning behind it, or what it really means with respect to limits or context.

In contrast, scientific thought exposes the factual basis and the reasoning, which tells us what the conclusion really means. Scientific thought is democratic and informs, and ideally gives everyone the same materials from which to draw factual conclusions, some of which may be new, strange and disconcerting, but nevertheless correct.

Autocorrelation
In statistics, the linear correlation of a sequence to itself ("auto"). The extent that any particular sample linearly predicts subsequent and prior samples (typically a different value for each positive or negative delay).

AUTODIN
AUTOmatic DIgital Network. A message switching network for the U.S. military, fielded in the late 1960's. Typically secure on each link between large secure facilities called ASC's (Automatic Switching Centers), with internal computer message switching or forwarding between links. Replaced by the MilNet packet switching network in the early 1990's.

Autokey
In cryptography, a stream cipher which generates the next keystream value from earlier elements elements in that same keystream ("auto"). Typically, the ciphertext output is fed back to modify the state of the random number generator producing the running key or confusion sequence.

Automorphism
In abstract algebra, the special case of an isomorphism where the field or group is mapped onto itself.

AUTOSEVOCOM
AUTOmatic SEcure VOice COMmunications.

Availability
One of the possible goals of cryptography. Keeping services open for use.

Avalanche
The observed property of a block cipher constructed in layers or "rounds" with respect to a tiny change in the input. The change of a single input bit generally produces multiple bit-changes after one round, many more bit-changes after another round, until, eventually, about half of the block will change. An analogy is drawn to an avalanche in snow, where a small initial effect can lead to a dramatic result. As originally described by Feistel:
"As the input moves through successive layers the pattern of 1's generated is amplified and results in an unpredictable avalanche. In the end the final output will have, on average, half 0's and half 1's . . . ." [p.22] -- Feistel, H. 1973. Cryptography and Computer Privacy. Scientific American. 228(5):15-23.

Also see mixing, diffusion, overall diffusion, strict avalanche criterion, complete, S-box. Also see the bit changes section of the "Binomial and Poisson Statistics Functions in JavaScript," locally, or @: http://www.ciphersbyritter.com/JAVASCRP/BINOMPOI.HTM#BitChanges.

Avalanche Effect
The result of avalanche. As described by Webster and Tavares:
"For a given transformation to exhibit the avalanche effect, an average of one half of the output bits should change whenever a single input bit is complemented." [p.523] -- Webster, A. and S. Tavares. 1985. On the Design of S-Boxes. Advances in Cryptology -- CRYPTO '85. 523-534.

Also see the bit changes section of the "Binomial and Poisson Statistics Functions in JavaScript" page (locally, or @: http://www.ciphersbyritter.com/JAVASCRP/BINOMPOI.HTM#BitChanges).

Avalanche Multiplication
In semiconductor electronics, the dominant reverse-voltage breakdown mode in so-called "zener" diodes and other PN junctions with breakdown voltages over 6 or 8 volts. A source of shot noise and additional noise from avalanche which can be much greater than shot noise.

In normal junctions, the space-charge region (depletion region) between P and N materials is fairly broad, so the extreme fields found in Zener breakdown do not occur. However, a combination of applied voltage, temperature, and random motion may cause a covalent bond to break anyway, in a manner similar to normal diode leakage. When a breakdown does occur, the charge carrier is attracted by the opposing potential and drops through the space-charge region, periodically interacting with covalent bonds there. When the field is sufficiently high, a falling charge carrier may build up enough energy to break another carrier free when it hits. Then both the original and resulting carriers continue to accelerate through the space-charge region, each possibly hitting and breaking many other bonds. The result is a growing avalanche of carriers produced by each single breakdown. The avalanche effect can be seen as a form of amplification and can be huge, for example, 10**8.

In a series of almost forgotten semiconductor physics research papers from the 1950's and 1960's, avalanching breakdown was shown to consist of a multitude of "microplasma" events of perhaps 20uA each. These events are not completely independent, but instead interact, but also have some apparently random component, probably thermal. At least some of the microplasma events seem to have negative dynamic resistance and function like tiny like neon bulbs (and may even emit light). One implication if this is an ability of some avalanching "zener" diodes to directly support small, unsuspected oscillations. A series of very extensive discussions on sci.electronics.design in 1997 (search: "zener oscillation") give experimental details. Both LC tank oscillation and RC relaxation oscillation were demonstrated in practice. Thus, avalanche multiplication, often assumed to be unquestionably "quantum random," actually may have a disturbing amount of predictable structure. True Zener breakdown does not appear to have the same problems, nor does thermal noise, as far as we know. Unfortunately, these "purer" sources may be much smaller than noise from avalanche multiplication.

In contrast to Zener breakdown, which has a negative temperature coefficient, avalanche multiplication has a positive temperature coefficient, like most resistances or conductors. Presumably this is due to heat causing increased activity in the crystal lattice, thus preventing electrons from falling as far before interacting, thus reducing the probability of breaking another bond, and reducing the amplification. In junctions that break down at about 6 volts the temperature effects tend to cancel. Also see: "Random Electrical Noise: A Literature Survey" (locally, or @: http://www.ciphersbyritter.com/RES/NOISE.HTM).

Axiom
In the study of logic and argument, a statement (premise) assumed true without proof. A postulate.

Back Door
A cipher design fault, planned or accidental, which allows the apparent strength of the design to be easily avoided by those who know the trick. When the design background of a cipher is kept secret, a back door is often suspected. Also see: trap door.

Balance
1. Equal on each side. The same number of each kind of symbol. Also see bit balance.
2. A term used in S-box and Boolean function analysis. As described by Lloyd:
"A function is balanced if, when all input vectors are equally likely, then all output vectors are equally likely." -- Lloyd, S. 1990. Properties of binary functions. Advances in Cryptology -- EUROCRYPT '90. 124-139.

There is some desire to generalize this definition to describe multiple-input functions. (Is a dyadic function "balanced" if, for one value on the first input, all output values can be produced, but for another value on the first input, only some output values are possible?) Presumably a two-input balanced function would be balanced for either input fixed at any value, which would essentially be a Latin square or a Latin square combiner. Also see Balanced Block Mixing. As opposed to bias. Also see Ideal Secrecy and Perfect Secrecy.

Balance is a pervasive requirement in many areas of cryptography; for example:

Balanced Block Mixer
A process or any implementation (for example, hardware, computer software, hybrids, or the like) for performing Balanced Block Mixing.

Balanced Block Mixing
(BBM). Balanced Block Mixing. The ciphering concept described in U.S. Patent 5,623,549. See

A mechanism for mixing large block values like those used in block ciphers. A BBM is balanced to avoid leaking information, and is effective in just a single pass, thus avoiding the need for repeated rounds and added hardware. A BBM has no data expansion. A BBM supports the construction of scalable ciphers with large blocks, and can be more efficient, more flexible, and more useful than conventional fixed and smaller designs. (See: ideal mixing, Mixing Cipher, Mixing Cipher design strategy and also the BBM articles, locally, or @: http://www.ciphersbyritter.com/index.html#BBMTech).

Technically, a Balanced Block Mixer is an m-input-port m-output-port mechanism with various properties:

  1. The overall mapping is one-to-one and invertible: Every possible input value (over all ports) to the mixer produces a different output value (including all ports), and every possible output value is produced by a different input value;
  2. Each output port is a function of every input port;
  3. Any change to any one of the input ports will produce a change to every output port;
  4. Stepping any one input port through all possible values (while keeping the other input ports fixed) will step every output port through all possible values.

The inverse mixing behaves similarly. Say, for example, we are mixing 64 bytes of message into 64 bytes of result: If we know 63 of the result bytes, we can step through the values of the 64th byte, and get 256 different messages, each of which will produce the 63 bytes we know (a homophonic sort of situation). If the actual messages are random-like and evenly distributed, it will be difficult to know which particular message is implied. The amount of uncertainty we have in the result is reflected in the amount of uncertainty we have about the message.

The Basic Balanced Block Mixer

The basic Balanced Block Mixer is a pair of orthogonal Latin squares. The two input ports affect the rows and columns of both squares, with the selected result in each square being the two output ports. For example, here is a tiny nonlinear "2-bit" or "order 4" BBM:

   3 1 2 0   0 3 2 1       30  13  22  01
   0 2 1 3   2 1 0 3   =   02  21  10  33
   1 3 0 2   1 2 3 0       11  32  03  20
   2 0 3 1   3 0 1 2       23  00  31  12

Suppose we wish to mix (1,3); 1 selects the second row up in both squares, and 3 selects the rightmost column, thus selecting (2,0) as the output. Since there is only one occurrence of (2,0) among all entry pairs, this discrete mixing function is reversible, as well as being balanced on both inputs.

In practice, we would probably want to use at least order 16, which can be efficiently stored as an ordinary 256-byte "8-bit" substitution table, one with a particular oLs structure in the data.

Scalable Linear Balanced Block Mixers

One way to use the BBM mixing concept is to develop linear equations for oLs mixing for scaling to various sizes (see my article: Fencing and Mixing Ciphers from 1996 Jan 16). We can do that in the finite field of mod-2 polynomials with an irreducible modulus. So we can easily have similar mixers of 16, 32, 64, 128 and 256 bit port widths, and so on. By using multiple mixers of different size in various connections, we can easily mix blocks of size compatible to existing ciphers, and much larger.

Explicit Nonlinear Balanced Block Mixers

A usually better way to use the BBM mixing concept is to develop small, nonlinear and keyed oLs's for use in FFT-like patterns with 2n ports. It is easy to construct keyed nonlinear orthogonal pairs of Latin squares of arbitrary 4n order as I describe in my articles:

In any FFT-style structure, there is exactly one "path" from any input to any output, and "cancellation" cannot occur. Thus, we can guarantee that any change to any one input must "affect" each and every output. Similarly, each input is equally represented in each output, which is ideal mixing. The resulting wide ideal mixing structure, using small BBM tables as each butterfly operation, is itself a BBM, and is dynamically scalable to virtually arbitrary size.

Scalable Mixing Advantages

Mixing has long been a problem in block ciphers. The difficulty of mixing wide block values is one reason most conventional block ciphers are small. But having a small block means that there is not much room to add features like:

on each block (also see block coding and huge block cipher advantages). Per-block authentication, for example, allows blocks to be used in real time, or delivered out of order, without first buffering and authenticating the entire message.

Large blocks also have room to hold sufficient uniqueness to support electronic codebook mode, which is not normally appropriate for block ciphers. Large blocks in ECB mode can support secure ciphering without ciphertext expansion, a goal which is very hard to reach in other ways.

When a BBM is implemented in software, the exact same unchanged routine can handle both wide mixing for real operation and narrow "toy" mixing for thorough experimental testing. This supports both scalable operation, and exhaustive testing of the exact code used in actual operation.

In hardware, BBM block throughput or block rate can be independent of block size. Wide blocks can be mixed in the same time as narrow blocks by pipelining each sub-layer of the mixing. That of course makes large blocks far faster per byte than small ones.

Also see All or Nothing Transform, Mixing Cipher, Dynamic Substitution Combiner, and Variable Size Block Cipher.

Also see some of the development sequence:

Balanced Combiner
In the context of cryptography, a combiner mixes two input values into a result value. A balanced combiner must provide a balanced relationship between each input and the result.

In a statically-balanced combiner, any particular result value can be produced by any value on one input, simply by selecting some appropriate value for the other input. In this way, knowledge of only the output value provides no information -- not even statistical information -- about either input.

The common examples of cryptographic combiner, including byte exclusive-OR (mod 2 polynomial addition), byte addition (integer addition mod 256), or other "additive" combining, are perfectly balanced. Unfortunately, these simple combiners are also very weak, being inherently linear and without internal state.

A Latin square combiner is an example of a statically-balanced reversible nonlinear combiner with massive internal state. A Dynamic Substitution Combiner is an example of a dynamically or statistically-balanced reversible nonlinear combiner with substantial internal state.

Balanced Line
In electronics, typically an interconnecting cable with two conductors having an exactly opposite or symmetric signal on each. More specifically, a two-wire signal interconnection where each wire has the same impedance to ground or any other conductor. In contrast to unbalanced line (like coaxial cable), where one of the conductors (the shield) has a low impedance to ground. In all cases, two conductors are required to transport a loop of current carrying a signal.

Each conductor of a balanced line systems should have similar driver output impedances (ideally low), similar wire effects, and similar receiver termination impedances (ideally high). At audio frequencies cables are not transmission lines, so "cable impedance" is not an issue, and the differential receiver need not match either the cable or the driver. When each wire has a similar impedance to ground, external magnetic and electrostatic fields should act on them similarly, producing a common effect on each wire which can "cancel out."

A transformer winding makes a good balanced line driver. In contrast, operational amplifier circuits with direct outputs probably will have only roughly-similar output impedances. Output resistors (e.g., 100 ohms) typically isolate each op amp output from the cable, and any difference will represent driver imbalance to external noise. After being transported, the differential mode signal is taken between the two conductors, thus ignoring common mode noise. A transformer winding makes a good differential receiver and also provides ground loop isolation. Operational amplifier receivers need a common-mode-rejection null adjustment for best performance.

At audio frequencies, the main advantage of balanced line is rejection of AC hum and related power noises. This can be achieved by driving only one line with the desired audio signal, provided both lines are terminated similarly both in the driver and receiver.

At radio frequencies, balanced line also minimizes undesired signal radiation. When the current changes in each wire are equal but opposite, they radiate "out of phase," resulting in cancellation. This is especially useful in TEMPEST, but does require that both lines be actively driven.

Base-64
A public code for converting between 6-bit values 0..63 (or 00..3f hex) and text symbols accepted by most computers. Also see ASCII.
        0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f

   0    A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P
   1    Q  R  S  T  U  V  W  X  Y  Z  a  b  c  d  e  f
   2    g  h  i  j  k  l  m  n  o  p  q  r  s  t  u  v
   3    w  x  y  z  0  1  2  3  4  5  6  7  8  9  +  /

   use "=" for padding

Base Spreading Resistance
In a bipolar transistor, the internal resistance between the base lead and the functioning base element. Often denoted Rbb' or rb' and having a typical value of perhaps 100 ohms. A major source of thermal noise in a bipolar transistor.

A bipolar transistor is made by diffusing impurities into a thin slice of extremely pure single-crystal semiconductor, such as silicon. Typically, the collector contact is made at the top surface, and the emitter contact is made on the bottom. The base element is essentially a thin film situated between the collector and emitter plates. The base current must flow on the film, which is naturally more resistive than the other thicker elements.

BBM
Balanced Block Mixing.

BBS
Bulletin Board System.

BB&S
Blum, Blum and Shub. A famous article developing a "provably secure" RNG.
Blum, L., M. Blum and M. Shub. 1983. Comparison of Two Pseudo-Random Number Generators. Advances in Cryptology: CRYPTO '82 Proceedings. Plenum Press: New York. 61-78.

Blum, L., M. Blum and M. Shub. 1986. A Simple Unpredictable Pseudo-Random Number Generator. SIAM Journal on Computing. 15:364-383.

The BB&S RNG is basically a simple squaring of the current seed (x) modulo a composite (N) composed of two primes (P,Q) of public key size. Primes P and Q must both be congruent to 3 mod 4, but the BB&S articles say that P and Q also must be special primes. The special primes construction apparently has the advantage of controlling the cycle structure of the system, and is part of the BB&S design in the original articles. Unfortunately, the special primes construction generally is not presented in current texts. Instead the texts deceptively describe a simplified version which they nevertheless call BB&S. Readers who do not study referenced articles will assume they know what BB&S said, but they are only partly correct.

Unlike more common RNG's, the BB&S construction is not maximal length, but instead defines systems with multiple cycles, including degenerate, short and long cycles. With large integer factors, state values on short cycles are very rare, but do exist. Short cycles are dangerous with any RNG, because when a RNG sequence begins to repeat, it has just become predictable, despite any theorems to the contrary. Consequently, if we key BB&S by choosing x[0] at random, we may unknowingly select a weak short cycle (a weak key), which would make the sequence predictable as soon as the cycle starts to repeat.

The original BB&S articles lay out the technology to compute the exact length of a long-enough cycle in the BB&S system. Since it can be much easier to verify cycle length than to actually traverse the cycle, this is a practical way to verify that x[0] selects a long-enough cycle. Values of x[0] can be chosen and checked until a long cycle is selected. Modern cryptography insists, to the point of strident intimidation, that such verification is unnecessary. However, the original authors apparently thought it was important enough to include in their work.

The real issue here is not the exposure of a particular weakness in BB&S, since choosing x[0] on a short cycle is very unlikely. But "unlikely" is not the same as "impossible." And if the design goal is to eliminate every known weakness, even extensive math which concludes "that particular weakness is too unlikely to worry about" is beside the point: "unlikely" does not satisfy the goal. Mathematics does not get to impose goals on designers or users.

BB&S is said to be "proven secure" in the sense that if factoring is hard, then the sequence is unpredictable. And many people do think that factoring large composites of public key size is hard. Yet when a short cycle is selected and used, BB&S is obviously insecure, and that is a direct contradiction for anyone who imagines that "proven secure" applies to them. Just knowing the length of a cycle (by finding sequence repetition) should be enough to expose the factors. This is also evidence that the assumption that factoring is hard is not universally true. Of course, we already know that factoring is not hard -- when we give away the factors. And giving away the factors is pretty much what we do if we allow ourselves to select, use and traverse a short cycle. For other comments on the proof, see the sci.crypt conversations:

The advantage of the special primes construction apparently is that all "short" (but not degenerate) cycles are "long enough" for use. Thus, we can simply choose x[0] at random, and then easily test that it is not on a degenerate cycle. (Just get some x[0], step x[0] to x[1], save x[1], step x[1] to x[2], then compare x[2] to x[1] and if they are the same, start over.) The result is a guarantee that the selected cycle is "long enough" for use. See the sci.crypt discussion:

Note that I now recommend taking two steps before checking for a degenerate cycle.

It is sometimes said that the special primes construction adds nothing to BB&S, but that really depends more on the goals of the cipher designer than the math. Since BB&S is very slow in comparison to other RNG's, someone selecting BB&S clearly has decided to pay a heavy toll with the expectation of getting an RNG which is "proven secure" in practice. (That actually misrepresents the BB&S proof, which apparently allows weakness to exist provided it is not an easy way to factor N.) The obvious goal is to get a practical RNG which has no known weakness at all.

No mere proof can protect us when we ourselves choose and use a weak key, even if doing that is shown to be statistically very unlikely. And if we do use a weak key, the "proven secure" RNG is clearly insecure, which surely contradicts the motive for using BB&S in the first place. In contrast, simply by using the special primes construction and checking for degenerate cycles, weak keys can be eliminated, at modest expense. Eliminating a known possibility of weakness, even if that possibility is very small, seems entirely consistent with the goal of achieving a practical RNG with no known weakness, even if the result is not an RNG proven to have absolutely no weakness at all.

Some would say that even the special primes construction is overkill, but without it the so-called "proof of strength" becomes a mere wish or hope that a short cycle is not being used, and I see that as a contradiction. It also might be a cautionary tale as to what mathematical cryptography currently accepts as proof, and as to what such "proof" means in practical use. For other examples of failure in the current cryptographic wisdom, see one time pad, and AES (as an example of the size of the permutation family in real conventional block ciphers), and, of course, old wives' tale. Also see algorithm.

Bel
The base-10 logarithm of the ratio of two power values (which is also the same as the difference between the log of each power value). The basis for the more-common term decibel: One bel equals 10 decibels.

Belief
A subjective conviction in the truth of a proposition. Mere belief is in contrast to a hypothesis, which states some expectation of fact and so may be verified or falsified by observation and comparison in an experiment or just reality itself. Also see dogma.

Ordinarily we distinguish mere belief from proven truth, belief thus implying something less than conclusive evidence. In this sense, to believe is to be willing to accept unproven or even unprovable assumptions, such as having faith, or trusting in some machine or property. One issue is whether such assumptions or trust is reasonable in the real world.

Limiting what one can or should believe seems intertwined with freedom of speech and individual rights: Surely, anyone can believe what they want. However, to the extent that we have real responsibilities to others and society at large, unfounded belief can not uphold those obligations. In a seminal essay called "The Ethics of Belief" (circa 1877 and reprinted on the web), William Clifford shows how unfounded belief is insufficient support for decisions of life and death and reputation. Many of us would extend that to business planning (the recent Waltzing with Bears by DeMarco and Lister (see risk management) reprints the first section of "The Ethics of Belief" as an appendix), as well as scientific discussions and claims.

In that point of view, claiming something is true, when one has not investigated the topic and does not know, is ethically wrong, even if the claim it turns out (by pure dumb luck) to be correct. It is not enough to claim something and hope it works out; it is instead necessary to know that the claim is correct before making the claim. The ethical requirement is to have performed an investigation sufficient to expect to know one way or another, and come to a rationally supportable conclusion. While not rising to the level of known fact, belief is something on which reputation rests. Being wrong thus has consequences to reputation, provided the error is in the essence and not mere correctable detail.

This idea of requiring substantial investigation to come to a belief may seem to conflict with the scientific method, in that a scientist seemingly makes a mere claim, which generally stands until shown false. But in reality we expect that claim to be something beyond "mere." We demand that a scientific investigator have put sufficient professional effort into a conclusion before using a scientific podium to spout off. The investigation is what provides an ethical basis for belief, which still may be wrong or (more likely) incomplete.

For example, scientific publication does not mean that all of science supports the described conclusions, which are still just claims made by particular scientists. Showing (not necessarily proving) a claim to be wrong is part of the process of science, not unwarranted intrusion. Showing someone wrong in this context naturally affects reputation, but rarely results in absolute ruin.

The process of experimentation involves making "claims," often to be disproven, but those are clearly labeled hypotheses for experiment, not conclusions for use by others.

In contrast, when we have conclusive evidence of truth we have knowledge and fact instead of belief. Facts do not require belief, nor do they respond to voting or authority. Clearly, science depends upon knowledge and fact, not personal beliefs, and it is crucial to know the difference. Also see: scientific method, extraordinary claims and rhetoric.

Bent Function
A Boolean function whose fast Walsh transform has the same absolute value in each term (except, possibly, the zeroth). This means that the bent function has the same distance from every possible affine Boolean function.

We can do FWT's in "the bottom panel" at the end of my: "Active Boolean Function Nonlinearity Measurement in JavaScript" page, locally, or @: http://www.ciphersbyritter.com/JAVASCRP/NONLMEAS.HTM.

Here is every bent sequence of length 4, first in {0,1} notation, then in {1,-1} notation, with their FWT results:

   bent {0,1}       FWT           bent {1,-1}       FWT

   0  0  0  1    1 -1 -1  1        1  1  1 -1    2  2  2 -2
   0  0  1  0    1  1 -1 -1        1  1 -1  1    2 -2  2  2
   0  1  0  0    1 -1  1 -1        1 -1  1  1    2  2 -2  2
   1  0  0  0    1  1  1  1       -1  1  1  1    2 -2 -2 -2
   1  1  1  0    3  1  1 -1       -1 -1 -1  1   -2 -2 -2  2
   1  1  0  1    3 -1  1  1       -1 -1  1 -1   -2  2 -2  2
   1  0  1  1    3  1 -1  1       -1  1 -1 -1   -2 -2  2 -2
   0  1  1  1    3 -1 -1 -1        1 -1 -1 -1   -2  2  2  2
These sequences, like all true bent sequences, are not balanced. Literature references on this point include:
"Let Qn = {1,-1}n. The defining property of a bent sequence x in Qn is that the Hadamard transform of x has constant magnitude."
"Let y be a bent sequence over {0,-1}n. . . . "The Hamming weight of y is 22k-1 (+ or -) 2k-1."
-- Adams, C. and S. Tavares. 1990. Generating and counting binary bent sequences. IEEE Transactions on Information Theory. IT-36(5):1170-1173.
"Bent functions, except for the fact that they are never balanced, exhibit ideal cryptographic properties."
-- Chee, S., S. Lee, K. Kim. 1994. Semi-bent functions. Advances in Cryptology -- ASIACRYPT '94 107-118.
". . . it has often to be considered as a defect from a cryptographic point of view that bent functions are necessarily non-balanced."
-- Dobbertin, H. 1994. Construction of Bent Functions and Balanced Boolean Functions with High Nonlinearity. K.U. Leuven Workshop on Cryptographic Algorithms (Fast Software Encryption). 61-74.
"Example 23 (Bent Functions Are Not Balanced) . . . ."
-- Seberry, J. and X. Zhang. "Hadamard Matrices, Bent Functions and Cryptography." The University of Wollongong. November 23, 1995.

The zeroth element of the {0,1} FWT is the number of 1's in the sequence.

Here are some bent sequences of length 16:

   bent {0,1}    0 1 0 0   0 1 0 0   1 1 0 1   0 0 1 0
    FWT          6,-2,2,-2,2,-2,2,2,-2,-2,2,-2,-2,2,-2,-2
   bent {1,-1}   1 -1 1 1   1 -1 1 1   -1 -1 1 -1   1 1 -1 1
    FWT          4,4,-4,4,-4,4,-4,-4,4,4,-4,4,4,-4,4,4

   bent {0,1}    0 0 1 0   0 1 0 0   1 0 0 0   1 1 1 0
    FWT          6,2,2,-2,-2,2,-2,2,-2,-2,-2,-2,2,2,-2,-2
   bent {1,-1}   1 1 -1 1   1 -1 1 1   -1 1 1 1   -1 -1 -1 1
    FWT          4,-4,-4,4,4,-4,4,-4,4,4,4,4,-4,-4,4,4

Bent sequences are said to have the highest possible uniform nonlinearity. But, to put this in perspective, recall that we expect a random sequence of 16 bits to have 8 bits different from any particular sequence, linear or otherwise. That is also the maximum possible nonlinearity, and here we actually get a nonlinearity of 6.

There are various more or less complex constructions for these sequences. In most cryptographic uses, bent sequences are modified slightly to achieve balance.

Berlekamp-Massey
The algorithm used to find the shortest LFSR which can reproduce a given sequence. This length is the linear complexity of the sequence.
Massey, J. 1969. Shift-Register Synthesis and BCH Decoding. IEEE Transactions on Information Theory. IT-15(1):122-127.

Bernoulli Trials
In statistics, observations or sampling with replacement which has exactly two possible outcomes, typically called "success" and "failure." Bernoulli trials have these characteristics:
  • Each trial is independent,
  • Each outcome is determined only by chance, and
  • The probability of success is fixed.

Bernoulli trials have a Binomial distribution.

Bias
1. In argumentation, favoring one side over another, typically by making the burden of proof more difficult for one side. Also see extraordinary claims.
2. In statistics, results consistently some amount displaced from the theoretical or practical expectation. This is a clear indication that the measured process has not been completely understood or modeled.
3. In electronics, typically a static or DC voltage or current. A bias voltage is necessary to condition some forms of transducer (see, for example, Geiger-Mueller tube). However, the most common use is to keep some active device (typically a transistor) "partly on," so that it can amplify or respond to both positive and negative parts of an AC signal.

Transistor biasing is trickier than it might seem from knowing the simple purpose of keeping the device "partly on":

  • There will be a wide range of different input bias solutions to cover a wide range of no-signal output currents, which also implies a wide range of output resistor values.
  • Output bias is normally controlled by the same base or gate used for input signal, which to some extent compromises the distinction between the two different functions.
  • Transistors apparently cannot be manufactured to close specifications, so different devices, even of exactly the same type, may need significantly different input bias values to achieve the same output results.

One common biasing approach is to place a particular DC voltage on the base or gate, and a resistor in the emitter or source lead. Transistor action then tends to increase current until the emitter or source has a voltage related to the base or gate, a form of negative feedback. This sets the output bias current, which with a particular pull-up resistor sets a desired output voltage. One difficulty with this approach is that it demands an input signal with lower impedance than the biasing, so that the AC signal will dominate. Another issue is that the emitter or source resistor will use some of the available voltage simply to establish bias, voltage which then is not available across the device for AC signals. (Also see transistor self-bias.)

Bijection
Functions f: X -> Y and the inverse f -1: Y -> X, where, for all x in X:
   if f(x) = y then f-1(y) = x.
With a bijection an inverse always exists. (Contrast with: involution.)

A bijection on bit-strings, F: {0,1}n -> {0,1}n is one way to describe the emulated, huge, simple substitution of a conventional block cipher. In such a cipher, the encryption function is a key-selected permutation of all possible input or block values, which is the input alphabet. See: bijective and mapping.

Bijective
A bijection. A mapping f: X -> Y which is both injective (one-to-one) and surjective (onto). For each unique x in X there is corresponding unique y in Y. An invertible mapping. See: bijection, permutation and simple substitution.

Bijective Compression
Apparently, a form of data compression in which arbitrary values or arbitrary strings can be decompressed into seemingly grammatical text. More generally, the ability to decompress random blocks or strings into data with structure and statistics similar to that expected from plaintext. The phrase "Bijective Compression" should be taken as a name or term of art, and not a mathematical description.

Making random data decompress into language text (necessarily also random in some way) would seem to be difficult. Different classes of plaintext, such as language, database files, program code, or whatever, probably require different compressors or at least different compression models. With respect to language text, such a compressor should decompress random strings into spaced correct words or "word salad." That should complicate attempts to automatically distinguish the original message or block from among other possibilities.

Should bijective compression actually be possible and practical, the significance would be massive. Computerized attacks can succeed only if a correct deciphering can be recognized automatically. When incorrect decipherings have structure which is close to plaintext, a computer may not be able to distinguish them from success. If humans skill is needed to read and judge the result of thousands or millions of brute-force attempts, traversing a keyspace may take tens of millions of times longer than simple computer scanning. Making an attack millions of times harder than it was before could be the difference between complete practical security and almost no security at all. Holding an attack loop down to human reading speeds could produce a massive increase in practical strength.

Binary
From the Latin for "dual" or "pair."
1. Dominantly used to indicate "base 2": The numerical representation in which each digit has an alphabet of only two symbols: 0 and 1. This is just one particular coding or representation of a value which might otherwise be represented (with the exact same value) as octal (base 8), decimal (base 10), or hexadecimal (base 16). Also see bit and Boolean.
2. The confusing counterpart to unary when describing the arity or number of inputs or arguments to a function, but dyadic is almost certainly a better choice.

Binomial Distribution
Binomial literally means "two names." In statistics, the probability of finding exactly k successes in n independent Bernoulli trials, each of which has exactly two possible outcomes, when the "success" probability is p:
               n    k      n-k
   P(k,n,p) = ( )  p  (1-p)
               k

This ideal distribution is produced by evaluating the probability function for all possible k, from 0 to n.

If we have an experiment which we think should produce a binomial distribution, and then repeatedly and systematically find very improbable test values, we may choose to reject the null hypothesis that the experimental distribution is in fact binomial.

Also see the binomial section of my JavaScript page: "Binomial and Poisson Statistics Functions in JavaScript," locally, or @: http://www.ciphersbyritter.com/JAVASCRP/BINOMPOI.HTM#Binomial, and my early message on randomness testing (locally, or @: http://www.ciphersbyritter.com/NEWS2/94080601.HTM).

Bipolar
1. Having two polarities, as in + and - voltages (e.g., AC), and P and N semiconductor.
2. A junction transistor (NPN or PNP). As opposed to a field effect transistor.
3. Circuits or devices implemented with junction transistors.

Birthday Attack
A form of attack in which it is necessary to obtain two identical values from a large population. The "birthday" part is the realization that it is far easier to find an arbitrary matching pair than to match any particular value. Often a hash attack. Also see: population estimation, augmented repetitions, birthday paradox, and the conversation "Birthday Attack Calculations," locally, or @: http://www.ciphersbyritter.com/NEWS4/BIRTHDAY.HTM.

Birthday Paradox
The apparent paradox that, in a schoolroom of only 23 students, there is a 50 percent probability that at least two will have the same birthday. The "paradox" is that we have an even chance of success with just 23 of 365 possible days represented.

The "paradox" is resolved by noting that we have a 1/365 chance of success for each possible pairing of students, and there are 253 possible pairs or combinations of 23 things taken 2 at a time. (To count the number of pairs, we can choose any of the 23 students as part of the pair, then any of the 22 remaining students as the other part. But this counts each pair twice, so we have 23 * 22 / 2 = 253 different pairs.) Note that 253 / 365 = 0.693151.

This problem seems to beg confusion between probability and expected counts, since the correct expectation is often fractional. We can relate the probability of finding a "double" of some birthday (Pd) to the expected number of doubles (Ed) as approximately (equations (5.4) and (5.5) from my article):

   Pd = 1 - e-Ed  ,

so

   Ed = -Ln( 1 - Pd )  .
For a success probability of 0.5, the expected doubles are
   Ed = -Ln( 1 - 0.5 ) = 0.693147  .

One way to model the overall probability of success is from the probability of failure (1 - 1/365 = 0.99726) multiplied by itself for each pair that could have a possible match. For the birthday case this model gives the overall failure probability of 0.99726253 (0.99726 to the 253rd power) or 0.4995, for a success probability of 0.5005.

A different model addresses the probability of success for each sample, instead of each pair. For population (N) and samples (s) (equation (1.2) from my article):

  Pd(N,s) = 1 - (1-1/N)(1-2/N)..(1-(s-1)/N)  ,
which gives a success probability for 23 samples of 0.5073.

Sometimes the problem is to find the number of samples (s) needed for a given probability of success in finding doubles (Pd) from a given population (N). Starting with equation (2.5) from my article and substituting (5.5), we get:

   s(N,p) = (1 + SQRT(1 - 8N Ln( 1 - Pd ))) / 2  .
For the birthday case the number of samples needed from a population of 365 for an even chance of success is:
   s(365,0.5) = (1 + SQRT(1 - (8 * 365 * -0.693))) / 2
              = (1 + SQRT( 2024.56 )) / 2
              = 45.995 / 2
              = 22.997  .
This result means that 23 samples should meet with success just a little more often than the 1 time in 2 demanded by p = 0.5.

Also see: birthday attack, population estimation, augmented repetitions, my Cryptologia "birthday" article: "Estimating Population from Repetitions in Accumulated Random Samples," locally, or @: http://www.ciphersbyritter.com/ARTS/BIRTHDAY.HTM, and an example and comparison to various other methods in the conversation "Birthday Attack Calculations," locally, or @: http://www.ciphersbyritter.com/NEWS4/BIRTHDAY.HTM.

Bit
1. The smallest possible discrete unit of information. A Boolean value: True or False; Yes or No; one or zero; Set or Cleared. A contraction of "binary digit," apparently coined by J. W. Tukey. Virtually all information to be communicated or stored digitally is coded in some way which fundamentally relies on individual bits. Alphabetic characters are often stored in eight bits, which is a byte.
2. The real number average of information in bits per symbol as used in entropy, assuming the computation uses base-2 logs.

In digital electronics, bits generally are represented by voltage levels on connected wires, at a given time. When the bit-value on a wire changes, some time will elapse until the wire reaches the new voltage level. Until that happens, the wire voltage is not a valid digital level and should not be interpreted as having a particular bit value. Also see: logic level.

Bit Balance
The goal of achieving a balance, or equal numbers of 1's and 0's bits, in some vector or block. (Especially see Dynamic Transposition.)

There are various ways this might be achieved:

  • Use a block code in which every codeword is balanced. This will expand the message somewhat.
  • Use a code in which most codewords are balanced, with a state machine to count the unbalance and correct it as soon as possible (e.g., 8b10b). This also expands the message.
  • Use a scrambler or simple Vernam cipher to randomize and statistically balance the data. This does not expand the message, but also does not guarantee perfect balance, although blocks of reasonable size will be "nearly" balanced to high probability.
  • Use my approach as described in my article "Dynamic Transposition Revisited Again" (40K) (locally, or @: http://www.ciphersbyritter.com/ARTS/DYNTRAGN.HTM):

    "Exact bit-balance can be achieved by accumulating data to a block byte-by-byte, only as long as the block can be balanced by adding appropriate bits at the end."

    "We will always add at least one byte of 'balance data' at the end of the data, a byte which will contain both 1's and 0's. Subsequent balance bytes will be either all-1's or all-0's, except for trailing 'padding' bytes, of some balanced particular value. We can thus transparently remove the balance data by stepping from the end of the block, past any padding bytes, past any all-1's or all-0's bytes, and past the first byte containing both 1's and 0's. Padding is needed both to allow balance in special cases, and when the last of the data does not completely fill the last block."

    "This method has a minimum expansion of one byte per block, given perfectly balanced binary data. ASCII text may expand by as much as 1/3, which could be greatly reduced with a pre-processing data compression step."

Bit Permutation
A re-arrangement of the bits in a block of data. Also see bit permutation cipher.

Bit Permutation Cipher
A transposition cipher that re-arranges the positions of bits in the block of data being enciphered (bit permutation). For security even when the block is all-1's or all-0's, the data can be bit-balanced before transposition (see Dynamic Transposition).

(My article "A Keyed Shuffling System for Block Cipher Cryptography," illustrates key hashing, a nonlinearized RNG, and byte shuffling. We would do a similar thing for bit-permutation, but with a larger and wider RNG and shuffling bits instead of bytes. See either locally, or @: http://www.ciphersbyritter.com/KEYSHUF.HTM).

Ciphering by bit-transposition has unusual resistance to known plaintext attack because many, many different bit-permutations of the plaintext data will each produce exactly the same ciphertext result. Consequently, even knowing both the plaintext and the associated ciphertext does not reveal the shuffling sequence. Bit-permutation thus joins double-shuffling in hiding the shuffling sequence, which is important when we cannot guarantee the strength of that sequence (as we generally cannot).

A transposition cipher is "dynamic" when it "never" permutes two blocks in the same way. Dynamic bit-permutation ciphers can be a very competitive practical alternative to both stream ciphers and conventional block ciphers. Although bit-shuffling may be slower, it has a clearer and more believable source of strength than the other alternatives.

Also see my article: "Dynamic Transposition Revisited Again" (40K) (locally, or @: http://www.ciphersbyritter.com/ARTS/DYNTRAGN.HTM), the Dynamic Transposition Ciphering conversation (730K) (locally, or @: http://www.ciphersbyritter.com/NEWS5/REDYNTRN.HTM).

Bit Shuffling
Bit transposition by the shuffle algorithm, see bit permutation.

Bit Transposition
Transposition by bit, see bit permutation.

Black
In TEMPEST, electrical wiring or signals carrying ciphertext which thus can be exposed without danger.

Black Box
An engineering term for a component only specified with input and output values, with the internal implementation irrelevant. In this way, the real complexity of the component need not be considered in understanding a system using the component.

Digital logic IC's are wildly successful examples of hardware black box components. Externally, they perform useful digital functions, and in most cases, digital designers need not think about the internal construction. Internally, however, the "digital" devices use analog transistors to effect digital operation.

An example of black box software design is a subroutine or Structured Programming module, where all interaction with the caller is in the form of parameters. The module uses the given resources, does what it needs, completes, and returns to the caller. As long as the module does what we want, there is no need to know how the module works, so we can avoid dealing with internal complexity at the lower level. And when the module does not work, it can be debugged in a minimal environment which avoids most of the complexity of the larger system, thus making debugging far easier.

Block
1. A fixed amount of data treated as a single unit.
2. More than one element treated as a single unit.
As opposed to a sequence of elements, as in a stream.

In a discussion of block cipher concepts, cryptography implicitly uses definition (2), because it is the accumulation of multiple characters (and the resulting larger ciphering alphabet) which is characteristic of conventional block ciphers. A one-element "block" simply cannot exhibit the various block issues (such as mixing, diffusion, padding and expansion) that we see in a real block cipher, and so fails to model both the innovation and the resulting problems. Similar effects occur when any scalable model is simplified beyond reason. (See: scientific method.) It is also possible to cipher blocks of dynamically selectable size, or even fine-grained variable size.

All real block ciphers are in fact streamed to handle more than one block of data. The actual ciphering might be seen as a stream meta-ciphering using a block cipher transformation. The point of this is not to provide a convenient academic way to contradict any possible response to a question of "stream or block," but instead to identify the origin of various ciphering properties and problems (see: a cipher taxonomy).

It is not possible to block-cipher just a single bit or byte of a block. (When that is possible, we may be dealing with a stream cipher.) If individual bytes really must be block-ciphered, it will be necessary to fill out each block with padding in some way that allows the padding to be distinguished from the actual plaintext data after deciphering.

Partitioning an arbitrary stream of into fixed-size blocks generally means the ciphertext handling must support data expansion, if only by one block. But handling even minimal data expansion may be difficult in some systems.

The distinction between "block" and "stream" corresponds to the common distinction between "block" and "character" device drivers in operating systems. This is the need to accumulate multiple elements and/or pad to a full block before a single operation, versus the ability to operate without delay but requiring multiple operations. This is a common, practical distinction in data processing and data communications.

A competing interpretation of block versus stream operation seems to be based on transformation "re-use": In that interpretation, block ciphering is about having a complex transformation, which thus directly supports re-use (providing each plaintext block "never" re-occurs). In that same interpretation, stream ciphering is about supporting transformation re-use by changing the transformation itself. These effects do of course exist (although in my view they are not the most fundamental issues for analysis or design). But that interpretation also allows both qualities to exist simultaneously at the same level of design, and so does not provide the full analytical benefits of a true logical dichotomy.

Block Cipher
A cipher which requires the accumulation of multiple data characters (or bytes) in a block before ciphering can start. This implies a need for storage to hold the accumulation, and time for the accumulation to occur. It also implies a need to handle partly-filled blocks at the end of a message. In contrast, a stream cipher can cipher bytes immediately, as they occur, and as many or as few as are required. Block ciphers can be called "codebook-style" ciphers, and are typically constructed as product ciphers, thus showing a broad acceptance of multiple encryption. Also see variable size block cipher and a cipher taxonomy.

There is some background:

  • The exact definition of a block cipher is oddly controversial, see: block cipher definitions.
  • There is some confusion over appropriate block and stream scientific models, see block cipher and stream cipher models.
  • There is a disturbing insistence by some academics that only one model deserves the name "block cipher." That is a problem because other types of cipher do operate on data in blocks, yet do not follow that model. The problem is that general characteristics of "block ciphering" are unlikely to be resolved when only one type of cipher is allowed under that name. This Glossary does not recognize those limitations.

    Conventional Block Ciphers

    A conventional block cipher is a transformation between all possible plaintext block values and all possible ciphertext block values, and is thus an emulated simple substitution on huge block-wide values. Within a particular block size, both plaintext and ciphertext have the same set of possible values, and when the ciphertext values have the same ordering as the plaintext, ciphering is obviously ineffective. So effective ciphering depends upon re-arranging the ciphertext values from the plaintext ordering, and this is a permutation of the plaintext values. A conventional block cipher is keyed by constructing a particular permutation of ciphertext values for each key.

    The mathematical model of a conventional block cipher is bijection, and the set of all possible block values is the alphabet. In cryptography, the bijection model corresponds to an invertible table having a storage element associated with each possible alphabet value. Since each different table represents a different permutation of the alphabet, the number of possible tables is the factorial of the alphabet size.

    In particular, a conventional block cipher with a 64-bit block has an alphabet size of 264 (that is, 2**64) elements (about 18446744073709552000), and the potential number of keys is the factorial of that, a number which needs 1.15397859e+22 bits (about 10**22 bits or more than a million million million bits) for full representation. (For these and similar computations, try the "Base Conversion, Logs, Powers, Factorials, Permutations and Combinations in JavaScript" page locally, or @: http://www.ciphersbyritter.com/JAVASCRP/PERMCOMB.HTM) But, in practice, a popular block cipher like DES can only select from among 256 (that is, 2**56) different tables using 56 key bits. To calculate the ratio of two values expressed as exponents we subtract, so to find the proportion of tables we can actually use, we have 1.15x1022 - 56, (or 1.15*(10**22)), which does not even change the larger expressed value. Thus, DES and other conventional block ciphers generally support only an almost infinitesimal fraction of the keys possible under their own mathematical and cryptographic model. This is an inherent selection of a tiny subset of keys, which is a massive deviation from the model of balanced, flat or unbiased keying across all possibilities. Also see AES.

    Block Cipher Data Diffusion

    In an ideal conventional block cipher, changing even a single bit of the input block will change all bits of the ciphertext result, each with independent probability 0.5. This means that about half of the bits in the output will change for any different input block, even for differences of just one bit. This is overall diffusion and is present in a block cipher, but usually not in a stream cipher. Data diffusion is a simple consequence of the keyed invertible simple substitution nature of the ideal block cipher.

    Improper diffusion of data throughout a block cipher can have serious strength implications. One of the functions of data diffusion is to hide the different effects of different internal components. If these effects are not in fact hidden, it may be possible to attack each component separately, and break the whole cipher fairly easily.

    Partitioning Messages into Fixed Size Blocks

    A large message can be ciphered by partitioning the plaintext into blocks of a size which can be ciphered. This essentially creates a stream meta-cipher which repeatedly uses the same block cipher transformation. Of course, it is also possible to re-key the block cipher for each and every block ciphered, but this is usually expensive in terms of computation and normally unnecessary.

    A message of arbitrary size can always be partitioned into some number of whole blocks, with possibly some space remaining in the final block. Since partial blocks cannot be ciphered, some random padding can be introduced to fill out the last block, and this naturally expands the ciphertext. In this case it may also be necessary to introduce some sort of structure which will indicate the number of valid bytes in the last block.

    Block Partitioning without Expansion

    Proposals for using a block cipher supposedly without data expansion may involve creating a tiny stream cipher for the last block. One scheme is to re-encipher the ciphertext of the preceding block, and use the result as the confusion sequence. Of course, the cipher designer still needs to address the situation of files which are so short that they have no preceding block. Because the one-block version is in fact a stream cipher, we must be very careful to never re-use a confusion sequence. But when we only have one block, there is no prior block to change as a result of the data. In this case, ciphering several very short files could expose those files quickly. Furthermore, it is dangerous to encipher a CRC value in such a block, because exclusive-OR enciphering is transparent to the field of mod 2 polynomials in which the CRC operates. Doing this could allow an opponent to adjust the message CRC in a known way, thus avoiding authentication exposure.

    Another proposal for eliminating data expansion consists of ciphering blocks until the last short block, then re-positioning the ciphering window to end at the last of the data, thus re-ciphering part of the prior block. This is a form of chaining and establishes a sequentiality requirement which requires that the last block be deciphered before the next-to-the-last block. Or we can make enciphering inconvenient and deciphering easy, but one way will be a problem. And this approach cannot handle very short messages: its minimum size is one block. Yet any general-purpose ciphering routine will encounter short messages. Even worse, if we have a short message, we still need to somehow indicate the correct length of the message, and this must expand the message, as we saw before. Thus, overall, this seems a somewhat dubious technique.

    On the other hand, it does show a way to chain blocks for authentication in a large-block cipher: We start out by enciphering the data in the first block. Then we position the next ciphering to start inside the ciphertext of the previous block. Of course this would mean that we would have to decipher the message in reverse order, but it would also propagate any ciphertext changes through the end of the message. So if we add an authentication field at the end of the message (a keyed value known on both ends), and that value is recovered upon deciphering (this will be the first block deciphered) we can authenticate the whole message. But we still need to handle the last block padding problem and possibly also the short message problem.

    Block Size and Plaintext Randomization

    Ciphering raw plaintext data can be dangerous when the cipher has a relatively small block size. Language plaintext has a strong, biased distribution of symbols and ciphering raw plaintext would effectively reduce the number of possible plaintext blocks. Worse, some plaintexts would be vastly more probable than others, and if some known plaintext were available, the most-frequent blocks might already be known. In this way, small blocks can be vulnerable to classic codebook attacks which build up the ciphertext equivalents for many of the plaintext phrases. This sort of attack confronts a particular block size, and for these attacks Triple-DES is no stronger than simple DES, because they both have the same block size.

    The usual way of avoiding these problems is to randomize the plaintext block with an operating mode such as CBC. This can ensure that the plaintext data which is actually ciphered is evenly distributed across all possible block values. However, this also requires an IV which thus expands the ciphertext.

    Worse, a block scrambling or randomization function like CBC is public, not private. It is easily reversed to check overall language statistics and thus distinguish the tiny fraction of brute force results which produce potentially valid plaintext blocks. This directly supports brute force attack, as well as any attack in which brute force is a final part. One alternative is to use a preliminary cipher to randomize the data instead of an exposed function. Pre-ciphering prevents easy plaintext discrimination; this is multiple ciphering, leading in the direction Shannon's Ideal Secrecy.

    Another approach (to using the full block data space) is to apply data compression to the plaintext before enciphering. If this is to be used instead of plaintext randomization, the designer must be very careful that the data compression does not contain regular features which could be exploited by the opponents.

    An alternate approach is to use blocks of sufficient size for them to be expected to have a substantial amount of uniqueness or entropy. If we expect plaintext to have about one bit of entropy per byte of text, we might want a block size of at least 64 bytes before we stop worrying about an uneven distribution of plaintext blocks. This is now a practical block size.

    Block Cipher Defintions
    As far as we know, the original automated ciphers were stream ciphers developed from the Vernam work with teleprinter encryption, as patented in 1919. Block terminology seems to have come along much later, to distinguish fundamentally different designs from the old, well-known streams. This distinction occurred long before modern open cryptographic analysis. Distinguishing "block" from "stream" in the present day is important because it is useful: a true dichotomy is the friend both of the analyst and student. (Also see a cipher taxonomy.)

    It may be helpful to recall a range of published distinctions between "stream cipher" and "block cipher" (and if anyone has any earlier references, please send them along). Note that open discussion was notably muted during the Cold War, especially during the 50's, 60's and 70's. I see the earlier definitions as attempts at describing an existing codification of knowledge, which was at the time tightly held but nevertheless still well-developed.

    Some Early Definitions

    • 1976. "Much as error correcting codes are divided into convolutional and block codes, cryptographic systems can be divided into two broad classes: stream ciphers and block ciphers. Stream ciphers process the plaintext in small chunks (bits or characters), usually producing a pseudo-random sequence of bits which is added modulo 2 to the bits of the plaintext. Block ciphers act in a purely combinatorial fashion on large blocks of text, in such a way that a small change in the input block produces a major change in the resulting output." (p. 646) [Unfortunately, these two different definitions establish four classes, not just the two classes suggested by the first sentence. For example, what do we call a cipher which acts on "blocks of text" but not in a "purely combinatorial fashion"? Having two different distinctions for only two classes is an error that we need to recognize and get beyond. /tfr]. -- Diffie, W. and M. Hellman. "New Directions in Cryptography." IEEE Transactions on Information Theory. Vol. IT-22, No. 6, November 1976.

    • 1979. "Transposition is a block cipher which operates on words (blocks) of length n, employing as keys the n! positional permutations to transpose the letters of a word." (p. 290) -- Lempel, A. "Cryptology in Transition." ACM Computing Surveys. Vol. 11, No. 4, Dec. 1979.

    • 1979. "A primitive distinction among cryptosystems is the structural classification into stream and block ciphers." "A stream cipher operates on the plaintext symbol by symbol to produce a sequence of cipher symbols . . . ." "In a block cipher a block of symbols from A is operated on jointly by the encryption algorithm . . . ." (p. 306) -- Simmons, G. "Symmetric and Asymmetric Encryption." ACM Computing Surveys. Vol. 11, No. 4, Dec. 1979.

    • 1979. "Block ciphers divide the plaintext into blocks, usually of a fixed size, and operate on each block independently." (p. 415) "Stream ciphers, in contrast, do not treat the incoming characters independently." (p. 415) [Note that blocking and independence are different concepts, thus introducing the confusion of exactly which part of the definition to follow. /tfr] -- Diffie, w. and M. Hellman. "Privacy and Authentication: An Introduction to Cryptography." Proceedings of the IEEE. Vol. 67, No. 3, March 1979.

    • 1982. "A block cipher transforms a string of input bits of fixed length (an input block) into a string of output bits of fixed length (an output block)." (p. 23) "A stream cipher employs a bit-stream generator to produce a stream of binary digits . . . which is then combined either with plaintext . . . to produce ciphertext, or with ciphertext . . . to recover plaintext." (p. 53) -- Meyer, C. and S. Matyas. Cryptography: A New Dimension in Data Security.

    Some Current Definitions

    • 1996. "Symmetric algorithms can be divided into two categories. Some operate on the plaintext a single bit (or sometimes byte) at a time; these are called stream algorithms or stream ciphers. Others operate on the plaintext in groups of bits. The groups of bits are called blocks, and the algorithms are called block algorithms or block ciphers." (p. 4) -- Schneier, B. Applied Cryptography.

    • 1997. "A block cipher is a function which maps n-bit plaintext blocks to n-bit ciphertext blocks; n is called the blocklength. It may be viewed as a simple substitution cipher with large character size." (p. 224) [But this definition excludes the case of a cipher with an output block larger than the input block, so what would such a cipher be called? /tfr] "Stream ciphers . . . are, in one sense, very simple block ciphers having block length equal to one." "They also can be used when the data must be processed one symbol at a time (e.g., if the equipment has no memory, or buffering of data is limited)." (p. 20) --Menezes, A., van Oorschot, P. and Vanstone, S. Handbook of Applied Cryptography.

    The intent of classification is understanding and use. Accordingly, it is up to the analyst or student to "see" a cipher in the appropriate context, and it is often useful to consider a cipher to be a hierarchy of ciphering techniques. For example, it is extremely rare for a block cipher to encipher exactly one block. But when that same cipher is re-used again that seems a lot like repeated substitution, which is the basis for stream ciphering. (Of course repeatedly using the same small substitution would be ineffective, but if we attempt to classify ciphers by their effectiveness, we start out assuming what we are trying to understand or prove.) So an alternate way to "see" the re-use of a block cipher is as a higher-level stream "meta-cipher" which uses a block cipher component. But that is exactly what we call "block ciphering."

    Alternate Distinctions

    Some academics insist upon distinguishing stream versus block ciphering by saying that block ciphers have no retained state between blocks, while stream ciphers do. Simply saying that, however, does not make it true, and only one example is needed to expose the distinction as false and misleading. A good example for that is my own Dynamic Transposition cipher, which is a block cipher in that it requires a full block of data before processing can begin, yet also retains state between blocks. So if DT is not a block cipher, what is it? We would hope to define only two categories, not four or more. Note that Lempel (1979, above) explicitly says that transposition is a block cipher. Again, see: a cipher taxonomy to see one approach on how ciphers relate.

    Blocks by Implementation

    Another issue is that stream ciphers can be implemented in ways that accumulate a block of data before ciphering. Internally, such systems generally have a streaming system which traverse the block element-by-element, perhaps multiple times. It is important to see beyond an apparent block requirement stemming from data manipulation only, which thus contributes no strength, to the internal ciphers which (hopefully) do provide strength.

    It is also possible to have multiple stream ciphers work on the same "block," and then we do have a legitimate "block cipher" (or perhaps a "block meta-cipher") formed by multiple encryption of stream ciphers. (Although multiple ciphering with additive stream ciphers is usually unhelpful, most conventional block ciphers are in fact multiple encryptions internally, so internal multiple ciphering is hardly a crazy approach.) But if we want to understand strength, we still need to consider the fundamental ciphering operations which, here, are streams. Simply making something work like a block cipher does not give it the same model as a conventional block cipher, and so does not provide for analysis at that level. In the end, we might see such a construction as a block meta-cipher composed of internal stream ciphers.

    Block Cipher and Stream Cipher Models
    In the study of technology, it is often important to create a scientific model which describes observed reality. In cryptography, things are more fluid than in the physical sciences, because the reality of ciphering is itself is a construction, and there are many such constructions. Consequently, it is not always obvious what model delivers the best insight. In fact, different models may provide different insights on exactly the same reality.

    Currently, there are three main block cipher models:

    • A transformation which is a keyed substitution.
    • The transformation which has no retained state.
    • The simple collection of multiple elements.
    Each of these models has widely different implications, advantages and problems.

    The Block as a Bijection

    The common academic model of a block cipher is the mathematical bijection, which cryptography calls simple substitution. In practice, such a cipher requires a table far too large to instantiate, and so the actual cipher only emulates a huge, keyed table.

    One advantage of the bijection model is that specific, measurable mathematical things can be said about a bijection. Of course exactly the same things also can be said about simple substitution, and the field of ciphering is cryptography, not mathematics.

    One problem with the bijection model is that it does not attempt to establish a dichotomy. In the bijection model, "block cipher" just another label in a presumably endless sequence of such labels, each representing a distinct ciphering approach. Consequently, the bijection model makes a poor contribution toward an overall cipher taxonomy useful in the analysis of arbitrary cipher designs.

    Another problem with the bijection model is that it establishes yet another term of art: The word block is well known, understood, and rarely disputed. The word cipher is also widely agreed upon. The phrase "block cipher" obviously includes nothing about bijections. So to define "block cipher" in terms of bijections is to take the phrase far beyond the simple meaning of the terms. We could scarcely describe this as anything other than misleading.

    Yet another problem with the bijection model, is that, since it presumes to define "block cipher" as a particular type of cipher, what are we to do with ciphers which operate on blocks and yet do not function as bijections (e.g., transposition cipher)? No longer are ciphers related by their proper description. This is even more misleading.

    Ultimately, the problem with the bijection model is not the model itself: The model is what it is because substitution is what it is. The problem is the insistence by some academics that this is the only valid model for a "block cipher." A much better choice for the bijection model is the phrase: "conventional block cipher."

    The Block as Static State

    The static state model puts forth the proposition that stream ciphers dynamically change their internal state, whereas block ciphers do not. Typically, there is also an understanding that the bijective block cipher model applies.

    One problem with the static state definition is again in the name itself: The phrase "block cipher" does not include the word "state." To use the phrase "block cipher" for a property of state is to create yet another term of art, preempting the obvious meaning of the phrase "block cipher," and preventing related block-like ciphers from having similar descriptions, thus misleading both instructor and student.

    Another problem with the static state model is that we can build stream-like ciphers which do not change their internal state (in fact, I claim we stream a substitution table when we repeatedly use it across a message, just like we stream DES). Similarly, we can build block-like ciphers which do change their internal state (I usually offer my Dynamic Transposition cipher as an example, but so is a block cipher built from multiple internal stream operations). So if we accept the static state model, what do we call those ciphers which function on blocks, and yet do change state? Why preempt the well-known terms "block" or "stream" for the fundamentally different properties of internal state?

    Ultimately, what insight does state classification provide that warrants usurping the obvious descriptive phrases "block cipher" and "stream cipher" instead of thinking up something appropriate?

    The Block as Multiple Elements

    The original mechanized ciphers were stream ciphers, starting with the Vernam cipher of 1919. The term "block cipher" may have been introduced in the secret world of government security to draw a practical distinction between the well-known stream concept, and the newer designs that operated on a block. (That would have been in the 50's, 60's or even 70's; hopefully, someone will either confirm this or correct it.) In the multiple-element model, a block cipher requires the accumulation of more than one data element before ciphering can begin.

    One advantage of the multiple-element definition is that it forms an easy dichotomy with the definition of a stream cipher as a cipher which does not require such accumulation. Also note that this is no mere semantic issue, but is instead just one representation of a broader concept of "one versus many" which rises repeatedly in computing practice, including:

    • "block" versus "character" device drivers in an OS, and
    • "parallel" versus "serial" buses in digital electronics.

    The various consequences of the single-element versus multiple-element dichotomy are well known: When blocks are accumulated from individual elements, storage is required for that accumulation, and time is required as well, which can imply latency. In contrast, when elements need not be accumulated, there need be neither storage nor latency, but the total overhead may be greater. While latency probably is not much of an issue for email ciphering, latency can be significant for real-time streams like music or video, or interactive handshake protocols. Overhead is, of course, a significant issue in system design.

    To see how the multiple-element block cipher definition works, consider the following:

    • What is a transposition cipher? It is one form of a block cipher.
    • What is a cipher which repeatedly runs several different internal stream-like ciphers across a data block? That is also one form of a block cipher.
    • What is simple substitution? Also one form of a block cipher. Although not the form of a block cipher, it is the "conventional block cipher."

    Strangely, a degenerate block is exactly the same as a degenerate sequence: just one element. In neither case does that element teach about the larger object: a one-element block does not have diffusion between elements, and a one-element stream does not have correlation between elements. (Similarly, is a single electronic wire with a fixed voltage one-wire "parallel" or one-value "serial"?) From this we conclude that the most important aspects of cryptographic (and electronic) design and analysis simply do not exist as a single element, so it is inappropriate to either use or judge a model at that level.

    Block Code
    In coding theory, given alphabet A, and length or block size n, the set of all sequences or codewords which consist of n selections from among set A, denoted {A}n.

    A block size of n bits typically means that 2n different codewords or "block values" can occur. An (n,k) block code uses those 2n codewords to represent the equal or smaller count of 2k different messages. Thus, a 64-bit block cipher normally encodes 64 plaintext bits into 64 ciphertext bits as a simple (64,64) code. But if 16 input bits are reserved for other use, the coding expands 48 plaintext bits into 64 ciphertext bits, so we have a (64,48) code.

    The normal use for extra codewords is to implement some form of error detection and/or error correction. This overhead is not normally called "inefficient coding," but is instead a simple cost of providing improved quality. In cryptography, the extra code words may be used to add security or improve performance by implementing:

    Also see 8b10b and huge block cipher advantages.

    Block Size
    The amount of data in a block. For example, the size of the DES block is 64 bits or 8 bytes or 8 octets.

    Blum, Blum and Shub
    BB&S.

    Boolean
    TRUE or FALSE; one bit of information.

    Boolean Algebra
    The logic of digital electronics. The algebraic ring formed by the set {0,1} with operations:
       NOT as complementation, indicated by ' (single quote)
           0' = 1
           1' = 0
       OR as addition, denoted "+"
           0 + 0 = 0
           0 + 1 = 1
           1 + 0 = 1
           1 + 1 = 1
       AND as multiplication, denoted "*" as usual
           0 * 0 = 0
           0 * 1 = 0
           1 * 0 = 0
           1 * 1 = 1
       XOR a useful but not essential operation
           0 XOR 0 = 0
           0 XOR 1 = 1
           1 XOR 0 = 1
           1 XOR 1 = 0
    
       1. Addition is Commutative:         x + y = y + x
       2. Addition is Associative:         x + (y + z) = (x + y) + z
       3. The Additive Identity:           0 + x = x
       4. The Additive Inverse:            x + x' = 1
       5. Multiplication is Associative:   x(yz) = (xy)z
       6. Multiplication is Distributive:  x(y + z) = xy + xz
       7. Multiplication is Commutative:   xy = yx
       8. The Multiplicative Identity:     1 * x = x
    
       Other:
                1 + x = x
                x + x = x
                x * x = x
               (x')' = x
    
       DeMorgan's Laws:
               (x + y)' = x'y'
               (xy)' = x' + y'
    
       XOR:
                x XOR y = xy' + x'y
               (x XOR y)' = xy + x'y'
    

    Boolean Function
    A function which produces a Boolean result. The individual output bits of an S-box can each be considered to be separate Boolean functions. Also see logic function.

    Boolean Function Nonlinearity
    The number of bits which must change in the truth table of a Boolean function to reach the closest affine Boolean function.

    Typically computed as the fast Walsh-Hadamard transform (FWT) of the function being measured. For more details, see the topic unexpected distance and the "Active Boolean Function Nonlinearity Measurement in JavaScript" page (locally, or @: http://www.ciphersbyritter.com/JAVASCRP/NONLMEAS.HTM).

    Note that the FWT computation is done for efficiency only. It is wholly practical to compute the nonlinearity of short sequences by hand. It is only necessary to manually compare each bit of the measured sequence to each bit of an affine Boolean function. That gives us the distance from that particular function, and we repeat that process for every possible affine Boolean function of the measurement length.

    Especially useful in S-box analysis, where the nonlinearity for the table is often taken to be the minimum of the nonlinearity values computed for each output bit.

    Also see my articles:

    Boolean Logic
    The logic which applies to variables which have only two possible values. Also the digital hardware devices which realize such logic, and are used to implement a electronic digital computers. Also see Boolean algebra.

    Boolean Mapping
    A mapping of some number n Boolean variables into some number m Boolean results. For example, an S-box.

    Braid
    The stream cipher formed by intermingling or multiplexing two or more streams of data into a single stream of ciphertext. For encryption, presumably some form of keyed RNG would select which plaintext stream would contribute the next bit or byte to the ciphertext stream. For decryption, a similar keyed RNG would demultiplex or allocate each ciphertext bit or byte to the appropriate plaintext stream. If the different streams are similar, from the ciphertext it should be difficult to distinguish which bits or bytes belong to which plaintext stream. Accordingly, the scheme may work best as a super encryption or a meta-cipher handling already encrypted and thus very similar random-like streams. See: "Simon's Braided Stream Cipher" (locally, or @: http://www.ciphersbyritter.com/BRAID/BRAID.HTM). Also see ciphertext expansion, transposition and null.

    Branch Number
    Measures of block mixing: The minimum number of nonzero elements (e.g., bytes) in both the input and output, over all possible inputs. Or the minimum number of changing elements in both the input and output, over all possible input changes. When that mixing is applied between layers of S-boxes, this can be the minimum number of changing or active S-boxes.

    The Technical Definitions

    The original definition is for linear mapping theta,

    "The minimum total Hamming weight [wh] of (a,theta(a)) is a measure for the minimum amount of diffusion that is realized by a linear mapping."

    Definition 6.10 The branch number B of a linear mapping theta is given by

       B(theta) = min(wh(a),wh(theta(a))), for a<>0
    

    -- Daemen, J. 1995. Cipher and Hash Function Design, Strategies Based on Linear and Differential Cryptanalysis. Thesis. Section 6.8.1.

    Branch number specifically applies only to a linear mixing. Actually, even that is not quite right: the real problem is keying, not nonlinearity (although in practice, keying may imply nonlinearity). To the extent that we can experimentally traverse the input block, a branch number certainly can be developed for a nonlinear mixer.

    But while any particular mixer can have a branch number, a keyed mixer will have a branch number for every possible key. Moreover, we would expect the minimum over all those nonlinear mixings to be very low, just like the minimum strength of any cipher over all possible keys (the opponent trying just one key) is also very low. Yet we do not attempt to characterize ciphers by their minimum strength over all possible keys.

    No keyed structure can be properly characterized by the extrema over all keys. When we have random variables such as keying, we should be thinking of the distribution of values, and the probability of encountering extreme values. And that is not branch number.

    More insight is available in the description of the SQUARE cipher:

    "It is intuitively clear that both linear and differential trails would beneft from a multiplication polynomial that could limit the number of nonzero terms in input and output difference (and selection) polynomials. This is exactly what we want to avoid by choosing a polynomial with a high diffusion power, expressed by the so-called branch number.

    Let wh(a) denote the Hamming weight of a vector, i.e., the number of nonzero components in that vector." [Normally, Hamming weight applies to bits, but here it is being used for bytes./tfr] "Applied to a state a, a difference pattern a' or a selection pattern u, this corresponds to the number of non-zero bytes. In [2] the branch number B of an invertible linear mapping was introduced as

       B(theta) = for a<>0, min wh(a)+ wh(a)
    
    This implies that the sum of the Hamming weights of a pair of input and output difference patterns (or selection patterns) to theta is at least B. It can easily be shown that B is a lower bound for the number of active S-boxes in two consecutive rounds of a linear or differential trail."

    "In [15] it was shown how a linear mapping over GF(2m)n with optimal B (B = n + 1) can be constructed from a maximum distance separable code."

    -- Daemen, J., L. Knudsen and V. Rijmen." 1997. "The Block Cipher {SQUARE}." Fast Software Encryption, Lecture Notes in Computer Science Vol. 1267:149-165. Section 4.

    Measurement of Linear Mixing

    In the wide trail strategy, branch number applies to a particular unkeyed and linear diffusion mechanism. In the SQUARE design, branch number also applies to a particular unkeyed and linear polynomial multiplication. So branch number might also describe the simple linear form of Balanced Block Mixing used in Mixing Ciphers. But linear BBM's apparently do not have an optimal branch number (over all possible input data changes), although in most cases they do have a good branch, and are dynamically scalable to both tiny and huge blocks on a block-by-block basis.

    Nonlinear Mixing

    Instead of linear diffusion, it should be "intuitively obvious" that nonlinear diffusion would be a better choice for a cipher, if such could be obtained with good quality at reasonable cost. Nonlinear Balanced Block Mixing occurs when the butterfly functions are keyed. Keying is easily accomplished by constructing appropriate orthogonal Latin squares using the fast checkerboard construction. But "branch number" does not apply to these keyed nonlinear constructions.

    The Optimal Branch Value

    The "optimal" branch value for the MDS codes in the SQUARE design is given as n + 1. The branch number is basically the minimum number of input and output elements which are guaranteed to change, and there are 2n such elements. But when we try all possible cases of changed input data across a block, somewhere in there we ourselves create various worst-case inputs with only one element changed. In those cases, even if all n outputs change, we only get a branch of n + 1, so that is the most possible.

    Break
    1. To destroy an item, or offend and thus destroy an agreement.
    2. In cryptography, an attack which destroys the advantage of a cipher in hiding information. We would call such an attack "successful."

    From the Handbook of Applied Cryptography:

    "1.23 Definition. An encryption scheme is said to be breakable if a third party, without prior knowledge of the key pair (e,d), can systematically recover plaintext from corresponding ciphertext in some appropriate time frame." [p.14]
    "Breaking an information security service (which often involves more than simply encryption) implies defeating the objective of the intended service." [p.15]

    The term "break" seems to be a term of art in academic cryptanalysis, where it apparently means a successful attack which takes less effort than brute force (or the cipher design strength, if that is less), even if the effort required is impractical, and even if the attack is easily prevented at the cipher system level. This meaning of the term "break" can be seriously misleading because, in English, "break" means "to render unusable" or "to destroy," and not just "to make a little more dubious."

    The academic meaning of "break" is also controversial, as it can be used as a slander to demean both cipher and designer without a clear analysis of whether the attack really succeeds. And even if the attack does succeed, the question is whether it actually reveals data or key material, thus making the cipher dangerous for use in practice.

    Everyone understands that a cipher is "broken" when the information in a message can be extracted without the key, or when the key itself can be recovered, with less effort than the design strength. And a break is particularly significant when the work involved need not be repeated on every message. But when the amount of work involved is impractical, the situation is best described as a theoretical or academic break. The concept of an "academic break" is especially an issue for ciphers with a very large keyspace, in which case it is perfectly possible for a cipher with an academic break to be more secure than ciphers with lesser goals which have no "break." It is also at least conceivable that an attack can be surprising and insightful and, thus, "successful" even if it takes more effort than the design strength, which would be no form of "break" at all.

    In my view, a documented flaw in a cipher, such as some statistic which distinguishes a practical cipher from some model, but without an attack which recovers data or key, at most should be described as a "theoretical" or "certificational" weakness. Unfortunately, even a problem which has no impact on security is often promoted (improperly, in my view) to the term academic break or even "break" itself.

    Brute Force Attack
    A form of attack in which each possibility is tried until success is obtained. Typically, a ciphertext is deciphered under different keys until plaintext is recognized. On average, this should take about half as many decipherings as there are keys. Of course, the possibility exists that the correct key might be chosen first.

    Even when the key length of a cipher is sufficient to prevent brute force attack, that key will be far too small to produce every possible plaintext from a given ciphertext (see Shannon's Perfect Secrecy). Combined with the fact that language is redundant, this means that very few of the decipherings will be words in proper form. So most wrong keys could be identified immediately.

    On the other hand, recognizing plaintext may not be easy. If the plaintext itself -- and all known structure in the plaintext -- could be hidden, even a brute force attack on the keys could not succeed, for the correct deciphering could not be recognized when it occurred. If the plaintext was not language, but computer code, compressed text, or even ciphertext from another cipher, recognizing a correct deciphering could be difficult. It seems odd that more systems do not seek to leverage this advantage. See: known plaintext, Shannon's Ideal Secrecy and multiple encryption.

    Brute force is the obvious way to attack a cipher, and the way most ciphers can be attacked, so ciphers are designed to have a large enough keyspace to make this much too expensive to succeed in practice. Normally, the design strength of a cipher is based on the cost of a brute-force attack.

    Bug
    Technical slang for "error in design or implementation." An unexpected system flaw. Debugging is a normal part of system development and interactive system design.

    Burden of Proof
    A legal term of art basically meaning that those who put forth an argument based on certain facts have the burden of supplying proof that the facts are as stated. Also see: extraordinary claims.

    Butterfly
    A pair of dyadic functions used as the fundamental operation for "in place" forms of FFT. The name comes from a common graphic depiction of FFT operation which has the shape of a standing "hourglass," or butterfly wings on edge. Also see fast Walsh transform.

    In most FFT diagrams, the input elements are shown in a vertical column at the left, and the result elements in a vertical column on the right. Lines represent signal flow from left to right. There are two computations, and each requires input from each of the two selected elements. In an "in place" FFT, the results conveniently go back into the same positions as the input elements. So we have two horizontal lines between the same elements, and two diagonal lines going to each "other" element, which cross. This is the "hourglass" shape or "butterfly wings" on edge.

    Bypass
    In electronics, typically a capacitor from a power lead to circuit ground, thus "bypassing" noise to ground instead of distributing noise to other circuitry. Also see decoupling and amplifier.

    A source of stable power is the most important requirement for any electronic device. In particular, digital logic functions can only be trusted to produce correct results if their power is kept within specified limits. It is up to the designer to provide sufficient correct power and guarantee that it remain within limits despite whatever else is going on.

    Most digital logic families use "totem pole" outputs, which means they have a transistor from Vcc or Vdd (power) to the output pin, and another transistor from the output pin to Vss (ground). Normally, only one transistor is ON, but as the output signal passes from one state to another, transiently, both transistors can be ON, leading to short, high-current pulses on both the Vcc and ground rails. These current pulses are essentially RF energy, and can and do produce ringing on power lines and a general increase in system noise. The pulses are also strong enough to potentially change both the Vcc and ground voltage levels in the power distribution system near the device, which can affect nearby logic and operation. Typically this occurs at some random moment when the worst conditions coincide to cause a logic fault. To avoid that, we want to bypass the current pulse away from the power system in general, so other devices are not affected.

    For many years, a typical rule of thumb was to use a 0.1uF ceramic disc for each supply at each bipolar chip, plus a 1uF tantalum for every 8 chips. That may still be a good formula for slower analog chips and older digital logic like LSTTL. But as chip speed has increased, bypassing has become more complex.

    Ideally, a bypass capacitor will be connected from every supply pin to the ground pin right at each chip. Ideally, there will be no lead left on either end of the capacitor: not 1/4 inch, not 1/8 inch, which is one reason why surface-mount capacitors are desirable. Ideally, any necessary lead will be wide, flat copper. But the ideal system is a goal, not reality.

    One of the effects of higher system speeds is that normal system operation now covers the resonant frequency of the bypass capacitors. Unfortunately, this resonance is not a fixed constant, even for a particular type of part. Bypass resonance is instead a circuit condition, involving the reactance of the closest bypass capacitor, plus the inductance in power connections, and reactance in other bypass capacitors. Although it is virtually impossible to remove inductance from PC-board traces, it is possible to use whole copper layers as "power planes" for power distribution.

    Resonance means that an impulse causes "ringing," in which energy is propagated back and forth between inductance and capacitance until it finally dissipates in circuit resistance or is radiated away, but the resulting signal from many devices may appear as increased system noise.

    Resonance would actually seem to be the ideal bypass situation, in that a resonant bypass presents the minimum impedance to ground. But it does that only at one frequency; lower and higher frequencies are less rejected. It seems quite impractical to tune for resonance with the "random" pulses occurring in complex logic. And, above resonance, inductance dominates and then higher-frequency noise and pulses are more able to affect the rest of the system.

    Another approach has been to use various bypass capacitors, typically 0.01uF and 0.1uF in parallel, "sprinkled around" the PC layout. The idea was that self-resonance in any one bypass capacitor would be hidden by the other capacitor of different value and, thus, different resonant frequency. Alone, either a 0.01uF or a 0.1uF cap may do an effective job. However, recent modeling indicated, and experimentation has confirmed, that using both together can be substantially worse than using either value alone.

    The inherent limitation in bypassing is that the normal bypass process is not "lossy" or dissipative. Pulse energy can be stored in the inductance of short leads or PC-board traces, and then "ring" in resonance with the usual ceramic bypass capacitors. Having many bypass caps often leads to complex RF filter-like structures which just pass the ringing energy around. An alternative is the wide use of tantalum bypass capacitors, since tantalum becomes increasingly lossy at higher frequencies and will dissipate pulse energy.

    Several approaches seem reasonable:

    • Use one value and type of small bypass capacitor throughout.
    • Use more tantalum capacitors, which have a surprisingly good high frequency bypass capability as well as considerable power storage.
    • Use ferrite beads to isolate individual chips or small groups of chips from the shared power distribution. This would introduce a few ohms of resistance in distribution leads at high frequencies only. Resonance effects might still exist, but would be simplified, localized and isolated from the rest of the system. Related isolationist approaches have a long history in radio technology, where it is commonly called decoupling.

    Byte
    A collection of eight bits. Also called an "octet." A byte can represent 256 different values or symbols. The common 7-bit ASCII codes used to represent characters in computer use generally are stored in a byte; that is, one byte per character.

  • C
    In math notation, the complex numbers.

    CA
    Certification authority.

    Capacitor
    A basic electronic component which acts as a reservoir for electrical power in the form of voltage. A capacitor acts to "even out" the voltage across its terminals, and to "conduct" voltage changes from one terminal to the other. A capacitor "blocks" DC and conducts AC in proportion to frequency. Capacitance is measured in Farads: A current of 1 Amp into a capacitance of 1 Farad produces a voltage change of 1 Volt per second across the capacitor.

    If we know the capacitance C in Farads and the frequency f in Hertz, the capacitive reactance XC in Ohms is:

       XC = 1 / (2 Pi f C)
       Pi = 3.14159...
    
    Capacitors in parallel are additive. Two capacitors in series have a total capacitance which is the product of the capacitances divided by their sum.

    A capacitor is typically two conductive "plates" or metal foils separated by a thin insulator or dielectric, such as air, paper, or ceramic. An electron charge on one plate attracts the opposite charge on the other plate, thus "storing" charge. A capacitor can be used to collect a small current over long time, and then release a high current in a short pulse, as used in a camera strobe or "flash."

    The simple physical model of a component which is a simple capacitance and nothing else works well at low frequencies and moderate impedances. But at RF frequencies and modern digital rates, there is no "pure" capacitance. Instead, each capacitance has a series inductance that often does affect the larger circuit. See bypass.

    Also see inductor and resistor.

    Cardinal
    A count; the number of items in a group. Also see: ordinal, alphabet, population, and universe.

    Cartesian Product
    The Cartesian product A x B is the set of all ordered pairs (a,b) where the first component is taken from set A, and the second component is taken from set B.

    Cascade
    The general concept of a sequence of stages: first one, then another, then another. In electronics, a sequence of operations or components. This usage long precedes the current use in cryptography of cascade ciphering.

    Cascade Ciphering
    Generally speaking, multiple encryption, product ciphering or multiple ciphering. (Also see cascade.) Some sources define cascade ciphering as implying the use of statistically independent keys, whereas product ciphers may use keys which are not independent. Surprisingly, neither usage is consistent with the original definitions:

    The earliest definition of "cascade cipher" I know (1983) does not mention key independence:

    "A Cascade Cipher (CC) is defined as a concatenation of block cipher systems, thereafter referred to as its stages; the plaintext of the CC plays the role of the plaintext of the first stage, the ciphertext of the i-th stage is the plaintext of the (i+1)-st stage and the ciphertext of the last stage is the ciphertext of the CC.

    "We assume that the plaintext and ciphertext of each stage consists of m bits, the key of each stage consists of k bits and there are t stages in the cascade."

    [Note the lack of the term "independent."/tfr]

    -- Even, S. and O. Goldreich. 1983. "On the power of cascade ciphers." Advances in Cryptology: Proceedings of Crypto '83. 43-50.

    A modern academic definition is:
    "[The] Product of several ciphers is also a product cipher, such a design is sometimes called a cascade cipher."

    [Note the lack of anything like "with independent keys."/tfr]

    -- Biryukov, Alex. (Faculty of Mathematics and Computer Science, The Weizmann Institute of Science.) 2000. Methods of Cryptanalysis. "Lecture 1. Introduction to Cryptanalysis."

    Similarly, the term "product encipherment" is defined in Shannon 1949 (and is quoted here under Algebra of Secrecy Systems) as the use of one cipher, then another with independent keys. Thus, the independent key terminology was defined in cryptography over half a century ago, and probably 34 years before "cascade ciphering" was defined for the same idea without the key independence requirement. Both terms are commonly and legitimately confused in use. Anyone using the terms "cascade ciphering" or "product ciphering" would be well advised to explicitly state what the term is supposed to mean, or to not complain when someone takes it to mean something else.

    Cathode
    In an electronic circuit element, the end which accepts electrons and sources conventional current flow. The (+) end of a charged battery. The (-) end of a battery being charged. The N side of a PN semiconductor junction. As opposed to anode.

    CBC
    The Cipher Block Chaining block cipher operating mode.

    c.d.f.
    In statistics, cumulative distribution function. A function which gives the probability of obtaining a particular value or lower.

    Certify
    To attest to something being genuine. Also see authentication.

    Certification Authority
    CA. The third party needed by public key systems to certify public keys; to assure that the public key being used actually belongs to the desired party instead of a man-in-the-middle. Also see public key infrastructure.

    CFB
    The Ciphertext Feedback operating mode.

    Chain
    An operation repeated in a sequence, such that each result depends upon the previous result, or an initial value. One example is the CBC operating mode.

    Particularly inappropriate as a description of multiple encryption, because a physical "chain" is is only as strong as the weakest link, while a sequence of ciphers is as strong as the strongest link.

    Chance

    Something which happens unpredictably.

    Chaos

    The unexpected ability to find numerical relationships in physical processes formerly considered random. Typically these take the form of iterative applications of fairly simple computations. In a chaotic system, even tiny changes in state eventually lead to major changes in state; this is called "sensitive dependence on initial conditions." It has been argued that every good computational random number generator is "chaotic" in this sense.

    In physics, the state of an analog physical system cannot be fully measured, which always leaves some remaining uncertainty to be magnified on subsequent steps. And, in many cases, a physical system may be slightly affected by thermal noise and thus continue to accumulate new information into its state.

    In a computer, the state of the digital system is explicit and complete, and there is no uncertainty. No noise is accumulated. All operations are completely deterministic. This means that, in a computer, even a "chaotic" computation is completely predictable and repeatable.

    Characteristic
    The characteristic of a finite field is the number of times the multiplicative identity must be added to itself to produce zero. The characteristic will be some prime number p or, if the summation will never produce zero, the characteristic is said to be zero.

    Checkerboard Construction
    Finding or building a large number of different keyed Latin squares, and especially orthogonal Latin squares, can be extremely difficult. Although some simple constructions are available for statistical use, few of those support making huge numbers of essentially random squares. One solution which appears to be new to cryptography is what I call the Checkerboard Construction:

    One way to construct a larger square is to take some Latin square and replace each of the symbols or elements with a full Latin square. By giving the replacement squares different symbol sets, we can arrange for symbols to be unique in each row and column, and so produce a Latin square of larger size.

    If we consider squares with numeric symbols, we can give each replacement square an offset value, which is itself determined by a Latin square. We can obtain offset values by multiplying the elements of a square by its order:

       0 1 2 3            0  4  8 12
       1 2 3 0   * 4 =    4  8 12  0
       2 3 0 1            8 12  0  4
       3 0 1 2           12  0  4  8
    
    To simplify the example, we can use the same original square for all of the replacement squares:
       0+ 0 1 2 3    4+ 0 1 2 3    8+ 0 1 2 3   12+ 0 1 2 3
          1 2 3 0       1 2 3 0       1 2 3 0       1 2 3 0
          2 3 0 1       2 3 0 1       2 3 0 1       2 3 0 1
          3 0 1 2       3 0 1 2       3 0 1 2       3 0 1 2
    
       4+ 0 1 2 3    8+ 0 1 2 3   12+ 0 1 2 3    0+ 0 1 2 3
          1 2 3 0       1 2 3 0       1 2 3 0       1 2 3 0
          2 3 0 1       2 3 0 1       2 3 0 1       2 3 0 1
          3 0 1 2       3 0 1 2       3 0 1 2       3 0 1 2
    
       8+ 0 1 2 3   12+ 0 1 2 3    0+ 0 1 2 3    4+ 0 1 2 3
          1 2 3 0       1 2 3 0       1 2 3 0       1 2 3 0
          2 3 0 1       2 3 0 1       2 3 0 1       2 3 0 1
          3 0 1 2       3 0 1 2       3 0 1 2       3 0 1 2
    
      12+ 0 1 2 3    0+ 0 1 2 3    4+ 0 1 2 3    8+ 0 1 2 3
          1 2 3 0       1 2 3 0       1 2 3 0       1 2 3 0
          2 3 0 1       2 3 0 1       2 3 0 1       2 3 0 1
          3 0 1 2       3 0 1 2       3 0 1 2       3 0 1 2
    
    which produces the order-16 square:
       0  1  2  3    4  5  6  7    8  9 10 11   12 13 14 15
       1  2  3  0    5  6  7  4    9 10 11  8   13 14 15 12
       2  3  0  1    6  7  4  5   10 11  8  9   14 15 12 13
       3  0  1  2    7  4  5  6   11  8  9 10   15 12 13 14
    
       4  5  6  7    8  9 10 11   12 13 14 15    0  1  2  3
       5  6  7  4    9 10 11  8   13 14 15 12    1  2  3  0
       6  7  4  5   10 11  8  9   14 15 12 13    2  3  0  1
       7  4  5  6   11  8  9 10   15 12 13 14    3  0  1  2
    
       8  9 10 11   12 13 14 15    0  1  2  3    4  5  6  7
       9 10 11  8   13 14 15 12    1  2  3  0    5  6  7  4
      10 11  8  9   14 15 12 13    2  3  0  1    6  7  4  5
      11  8  9 10   15 12 13 14    3  0  1  2    7  4  5  6
    
      12 13 14 15    0  1  2  3    4  5  6  7    8  9 10 11
      13 14 15 12    1  2  3  0    5  6  7  4    9 10 11  8
      14 15 12 13    2  3  0  1    6  7  4  5   10 11  8  9
      15 12 13 14    3  0  1  2    7  4  5  6   11  8  9 10
    

    Clearly, this Latin square exhibits massive structure at all levels, but this is just a simple example. In practice we would create and use a different order-4 table for each position, and yet another for the offsets. We would also shuffle all rows, columns, and symbols in the larger square. And we could use order-16 squares to construct a keyed square of order-256. The result is a balanced table in a difficult to predict arrangement, a distinct selection from among a plethora of similar tables, and, thus, apparently ideal for cryptographic use as a Latin square combiner.

    Keyspace

    There are 576 Latin squares of order 4, any one of which can be used as any of the 16 replacement squares. The offset square is another order-4 square. So we can construct 57617 (about 8 x 1046 or 2155) different squares of order 16 in this way. Then we can shuffle the resulting square in 16! * 15! (about 2 x 1025 or 284) different ways, thus producing about 2 x 1072 squares, for about 240 bits of keying per square. (Even if we restrict ourselves to using only the 144 order 4 squares formed by shuffling a single standard square, we still have a 206-bit keyspace.) We could store two of the resulting order 16 "4 bit" squares in a 256-byte table for use as a "pseudo-8 bit" combiner, and might even select a combiner dynamically from an array of such tables.

    The construction is also applicable to orthogonal Latin squares. See my articles:

    Checksum
    An early and simple form of error detecting code. Commonly, an actual summation of data values in a register of some reasonable size like 16 bits. Unfortunately, addition is not particularly effective in detecting errors: In one massive set of experiments with real data, a 16-bit checksum was shown to detect errors about as well as a 10-bit CRC.

    Improved checksums (e.g., Fletcher's checksums) include both data values and data positions and may perform within a factor of 2 of CRC. One advantage of a true summation checksum is a minimal computation overhead in software (in hardware, a CRC is almost always smaller and faster). Another advantage is that when header values are changed in transit, a summation checksum is easily updated, whereas a CRC update is more complex and many implementations will simply re-scan the full data to get the new CRC.

    The term "checksum" is sometimes applied to any form of error detection, including more sophisticated codes like CRC.

    Chi-Square
    In statistics, a goodness of fit test used for comparing two distributions. Mainly used on nominal and ordinal measurements. Also see: Kolmogorov-Smirnov, one-sided test, and two-sided test.

    In the usual case, "many" random samples are counted by category or separated into value-range "bins." The reference distribution gives us the the number of values to expect in each bin. Then we compute a X2 test statistic related to the difference between the distributions:

       X2 = SUM( SQR(Observed[i] - Expected[i]) / Expected[i] )
    

    ("SQR" is the squaring function, and we require that each expectation not be zero.) Then we use a tabulation of chi-square statistic values to look up the probability that a particular X2 value or lower (in the c.d.f.) would occur by random sampling if both distributions were the same. The statistic also depends upon the "degrees of freedom," which is almost always one less than the final number of bins. See the chi-square section of the "Normal, Chi-Square and Kolmogorov-Smirnov Statistics Functions in JavaScript" page (locally, or @: http://www.ciphersbyritter.com/JAVASCRP/NORMCHIK.HTM#ChiSquare).

    The c.d.f. percentage for a particular chi-square value is the area of the statistic distribution to the left of the statistic value; this is the probability of obtaining that statistic value or less by random selection when testing two distributions which are exactly the same. Repeated trials which randomly sample two identical distributions should produce about the same number of X2 values in each quarter of the distribution (0% to 25%, 25% to 50%, 50% to 75%, and 75% to 100%). So if we repeatedly find only very high percentage values, we can assume that we are probing different distributions. And even a single very high percentage value would be a matter of some interest.

    Any statistic probability can be expressed either as the proportion of the area to the left of the statistic value (this is the "cumulative distribution function" or c.d.f.), or as the area to the right of the value (this is the "upper tail"). Using the upper tail representation for the X2 distribution can make sense because the usual chi-squared test is a "one tail" test where the decision is always made on the upper tail. But the "upper tail" has an opposite "sense" to the c.d.f., where higher statistic values always produce higher percentage values. Personally, I find it helpful to describe all statistics by their c.d.f., thus avoiding the use of a wrong "polarity" when interpreting any particular statistic. While it is easy enough to convert from the c.d.f. to the complement or vise versa (just subtract from 1.0), we can base our arguments on either form, since the statistical implications are the same.

    It is often unnecessary to use a statistical test if we just want to know whether a function is producing something like the expected distribution: We can look at the binned values and generally get a good idea about whether the distributions change in similar ways at similar places. A good rule-of-thumb is to expect chi-square totals similar to the number of bins, but distinctly different distributions often produce huge totals far beyond the values in any table, and computing an exact probability for such cases is simply irrelevant. On the other hand, it can be very useful to perform 20 to 40 independent experiments to look for a reasonable statistic distribution, rather than simply making a "yes / no" decision on the basis of what might turn out to be a rather unusual result.

    Since we are accumulating discrete bin-counts, any fractional expectation will always differ from any actual count. For example, suppose we expect an even distribution, but have many bins and so only accumulate enough samples to observe about 1 count for every 2 bins. In this situation, the absolute best sample we could hope to see would be something like (0,1,0,1,0,1,...), which would represent an even, balanced distribution over the range. But even in this best possible case we would still be off by half a count in each and every bin, so the chi-square result would not properly characterize this best possible sequence. Accordingly, we need to accumulate enough samples so that the quantization which occurs in binning does not appreciably affect the accuracy of the result. Normally I try to expect at least 10 counts in each bin.

    But when we have a reference distribution that trails off toward zero, inevitably there will be some bins with few counts. Taking more samples will just expand the range of bins, some of which will be lightly filled in any case. We can avoid quantization error by summing both the observations and expectations from multiple bins, until we get a reasonable expectation value (again, I like to see 10 counts or more). This allows the "tails" of the distribution to be more properly (and legitimately) characterized. (The technique of merging adjacent bins is sometimes called "collapsing.")

    Chosen Plaintext
    Defined plaintext.

    Cipher
    1. Any system which uses encryption; a cipher system.
    2. A key-selected secret transformation between plaintext and ciphertext. A key-selected function which takes plaintext to ciphertext, and an inverse function which takes that ciphertext back to the original plaintext.
    3. Specifically, a secrecy mechanism or process which operates on individual characters or bits independent of semantic content. As opposed to a secret code, which generally operates on words, phrases or sentences, each of which may carry some amount of complete meaning.

    Also see: cryptography, block cipher, stream cipher, substitution, permutation, cipher system, a cipher taxonomy, cryptanalysis, attack, security and strength.

    A good cipher can transform secret information into a multitude of different intermediate forms, each of which represents the original information. Any of these intermediate forms or ciphertexts can be produced by ciphering the information under some key value. The intent is that the original information only be exposed by one of the many possible keyed interpretations of that ciphertext. Yet the correct interpretation is available merely by deciphering under the appropriate key.

    A cipher appears to reduce the protection of secret information to enciphering under some key, and then keeping that key secret. This is a great reduction of effort and potential exposure, and is much like keeping your valuables in your house, and then locking the door when you leave. But there are also similar limitations and potential problems.

    With a good cipher, the resulting ciphertext can be stored or transmitted otherwise exposed without also exposing the secret information hidden inside. This means that ciphertext can be stored in, or transmitted through, systems which have no secrecy protection. For transmitted information, this also means that the cipher itself must be distributed in multiple places, so in general the cipher cannot be assumed to be secret. With a good cipher, only the deciphering key need be kept secret. (See: Kerckhoffs' requirements, but also security through obscurity.)

    Note that a cipher does not, in general, hide the length of a plaintext message, nor the fact that the message exists, nor when it was sent, nor, usually, the addressing to whom and from whom. Thus, even the theoretical one time pad (often said to be "proven unbreakable") does expose some information about the plaintext message. If message length is a significant risk, random amounts of padding can be added to confuse that, although padding can of course only increase message size, and is an overhead to the desired communications or storage. This typically would be handled at a level outside the cipher design proper, see cipher system.

    Cipher Engineering

    It is important to understand that ciphers are unlike any other modern product design, in that we cannot know when a cipher "works." For example:

    • A bridge is designed with beams of known strength. The strength of the resulting structure can be simulated and computed. When built, we can roll something heavy across and see that the bridge "works." Over time, we can develop trust that the bridge will work (as a bridge) when we need it.
    • A car "works" when it moves, and over time we can develop trust that it will move when and how we want. We know when it does not work.
    • A typical computer program does something and we can see the results, so we know if what we want actually occurs. Over time, we can build trust that the program will do what we want.
    • With a medicine, if we cannot see (or show) that it is working for us, we switch to something else.
    Ciphers are like none of those things because an opponent may break the cipher and not bother to tell us: we simply cannot know when a cipher "works." Since we do not know if a cipher is keeping our secrets safe, repeated use cannot build trust. And since we cannot see the outcome, we cannot know how to design a cipher to "guarantee" (in any sense at all) that the result will do what we need. In industrial quality-control terms, cipher design is literally "out of control." Also see: scientific method.

    Cipher Block Chaining
    (CBC). Cipher Block Chaining is an operating mode for block ciphers. CBC mode is essentially a crude meta-stream cipher which streams block-size transformations. As a consequence, the resulting meta-cipher is not a bijection like the block cipher component.

    In CBC mode the ciphertext value of the preceding block is exclusive-OR combined with the plaintext value for the current block. This randomization has the effect of distributing the resulting block values evenly among all possible block values, and so tends to prevent codebook attacks. But ciphering the first block generally requires an IV or initial value to start the process. And the IV necessarily expands the ciphertext by the size of the IV.

    [There are various possibilities other than CBC for avoiding plaintext block statistics in ciphers. One alternative is to pre-cipher, presumably with a different cipher and key, thus producing randomized plaintext blocks (see multiple encryption). Another alternative is to use a block at least 64 bytes wide, which, if it contains language text, can be expected to contain sufficient unknowable randomness to avoid codebook attacks (see huge block cipher advantages).]

    Note that the exposed nature of the CBC randomizer (the previous block ciphertext) does not hide plaintext or plaintext statistics. When simple deciphering exposes plaintext, the vast majority of possible plaintexts can be rejected automatically, based on their lack of bit-level and character and word structure. Normal CBC does not improve this situation much at all.

    CBC First Block Problems

    In CBC mode, each randomizing value is the ciphertext from each previous block. Clearly, all the ciphertext is exposed to the opponent, so there would seem to be little benefit associated with hiding the IV, which, after all, is just the first of these randomizing values. Clearly, in the usual case, if the opponent makes changes to a ciphertext block in transit, that will hopelessly garble two blocks (or perhaps just one) of the recovered plaintext. As a result, it is very unlikely that an opponent could make systematic changes in the plaintext simply by changing the ciphertext.

    But the IV is a special case: if the IV is not enciphered, and if the opponents can intercept and change the IV in transit, they can change the first-block plaintext bit-for-bit, without a block-wide garble. That means an opponent could make systematic changes to the first block plaintext simply by changing the IV. So, if the opponents know the first block plaintext (which could be a logo, name, date, or fixed dollar value), the stage is set for a potentially serious man-in-the-middle (MITM) problem. (The far more serious public-key MITM problem is an authentication failure with respect to the key, not an IV or data, and is a completely different issue.) Note that the CBC first block problem is completely independent of the cipher key and whether or not it changes, and is an even larger problem with the modern wider blocks used in AES.

    CBC First Block Solutions

    Despite howls of protest to the contrary, it is easy to see that the CBC first-block problem is a confidentiality problem, not an authentication problem. To see this, we simply note that all that is necessary to avoid the problem is to keep the IV secret. When the IV is protected, the opponent cannot know which changes to make to reach a desired plaintext. And, since the problem can be fixed without any authentication at all, it is clear that the problem was not a lack of authentication in the first place. Instead, the problem was caused by exposing the IV, and solving that is the appropriate province of the CBC and block level, instead of a MAC at the cipher system and message level.

    To fix the CBC first-block problem it is not necessary to check the plaintext for changes by using a MAC. Nor is a MAC necessarily the only way to authenticate a message. But if we are going to use a MAC anyway, that is one way to solve the problem. That works because a MAC can detect the systematic changes which a lack of confidentiality may have allowed to occur. But if a MAC is not otherwise desired, introducing a MAC to solve the CBC first-block problem is probably overkill, because only the block-wide IV needs to be protected, and not the entire message.

    The reason we might not want to use a MAC is that a MAC carries some inherent negative consequences. One of those is a processing latency, in that we cannot validate the recovered plaintext until we get to the end and check the digest. Latency can be a serious problem with streaming data like audio and video, and with interactive protocols. But even with an email message we have to buffer the whole message as decrypted and wait for the incoming data to finish before we can do anything with it (or we can make encryption hard and decryption easy, but one side will be a problem). Or we can set up some sort of packet structure with localized integrity checks and ciphertext expansion in each packet. But that seems like a lot of trouble when an alternative is just to encipher the IV.

    Even when a MAC is used at a higher level anyway, it may be important for Software Engineering and modular code construction to handle at the CBC level as many of the problems which CBC creates as possible. This avoids forcing the problem on, and depending upon a correct response from, some unknown programmer at the higher level, who may have other things on the mind. Handling security problems where they occur and not passing them on to a higher layer is an appropriate strategy for security programming.

    As the problems compound themselves, it seems legitimate to point out that the CBC first-block problem is a CBC-level security issue caused by CBC and by transporting the IV in the open. The CBC first-block problem is easily prevented simply by transporting the IV securely, by encrypting the IV before including it with the ciphertext. Also see "The IV in Block Cipher CBC Mode" conversation (locally, or @: http://www.ciphersbyritter.com/NEWS6/CBCIV.HTM).

    Ciphering
    The use of a cipher. The general term which includes both enciphering and deciphering.

    Cipher System
    A larger system which includes one or more ciphers and generally involves much more, including:
    • Key Management. Facilities to support secure key loading, user storage, use, re-use, archival storage, loss, and destruction. Also possibly facilities for secure key creation and transport encryption. Key storage may involve creating a password-encrypted mini-database of actual key values selected by open alias (see: alias file). Key transport may involve a full public key component, with its own key construction, storage, use, loss and destruction reqirements, especially including some form of cryptographic key certification, and possibly including a large, complex, and expensive certification infrastructure. (Also see hybrid.) Also see the Cloak2 documents for implemented key management features (locally, or @: http://www.ciphersbyritter.com/CLO2FEA.HTM) and alias file usage (locally, or @: http://www.ciphersbyritter.com/PROD/CLO2DOC3.HTM#DetAF).
    • Keyphrase Hashing. In general, language phrases must be converted to randomized and dense binary values for cipher use. In general, a common hash such as CRC is sufficient for the task, and a cryptographic hash is not required.
    • Message Key Creation and Use. In general, a random value is used as the key for the actual data. Typically, the random message key value will be the only thing encrypted by the key from the alias file, or the public key component.
    • Unknowable Randomness Creation or Collection. For message keys, other keys and possibly other protocols.
    • Message Integrity Coding. Detect when a message has changed in transit. This could be a MAC, or even a simple hash protected by a conventional block cipher.
    • Cipher Selection. A cipher system need not always use the same cipher, and a multiple encryption system may use different ciphers in different sequences. A dynamic selection protocol.
    • Data Compression. Messages often can be compressed, leading to shorter messages and apparently more-complex plaintext. (But if the compression method is known and unkeyed, compression may not add much cryptographic advantage.) Data compression may help obscure the original message length, however.
    • Message Length and Position Concealment. In general, ciphers conceal message data, not message length. That can be a serious security hazzard which may need to be addressed. Random amounts of random data (e.g., nulls) can be added to the top and bottom of a message, or even inside the data itself. Null positions can be keyed and then might essentially constitute another cipher layer in a multiple encryption system.
    • Data Ciphering. The actual ciphers themselves, keyed by the transported and decrypted message key.
    • Data Length Limitation and Re-Keying. Most ciphers can securely handle only some maximum amount of data before the keying must change to preserve security. For example, if an opponent collects a codebook of known-plaintext ciphertexts, we can expect that to become useful at something like the square-root of the number of different block values (see birthday attack). So a 64-bit block cipher probably should be limited to ciphering 2**32 blocks or less under a single key. Since various academic attacks on DES need something like 2**47 known-plaintexts, simply limiting the amount of data processed under one key prevents those attacks. Indeed, many academic attacks require a huge amount of known plaintext or even defined plaintext ciphered under a single key. By supporting automatic re-keying, the cipher system can assure that the required message volume simply cannot exist.
    • IV Creation and Use. Most operating modes for conventional block ciphers need an IV, which then (usually) needs to be protected and made part of the ciphertext.
    • Block Partitioning and End Handling. Messages of arbitrary length need to be partitioned into the fixed-size blocks used by conventional block ciphers, and any remaining partial-block of data padded into a full block or otherwise handled. It is important that any such solution handle 1-byte messages.
    • Secure File Overwrite. Operating systems generally do not erase files, but instead simply make the deleted file space available for use. It is possible to read "deleted" data from a disk until the released sectors are actively overwritten.
    • Secure Memory Overwrite. Memory buffers holding keys and data should be cleared by active overwrite as soon as their use is complete.
    • Secure Plaintext Creation and Display. Unless plaintext is "born secure," the operating system is going to be a serious security problem. A multitasking "swap file" is often a major security risk.

    Also see traffic analysis, Software Engineering, Structured Programming, and comments in the "Cipher Review Service" document (locally, or @: http://www.ciphersbyritter.com/CIPHREVU.HTM).

    A Cipher Taxonomy
    Taxonomy is classification or grouping by common characteristics, instead of by name, development path, or surface similarity. Developing a useful taxonomy is one of the goals and roles of science. The advantage of a taxonomy is that we can study general concepts which then apply to the many different things which fit in a single group. We can also compare and contrast different groups and their effects on the things we study. Ideally, a taxonomy will support future developments without major changes in structure. One way to do that is to start with a dichotomy, which separates the universe of all possibilities into exactly two groups. Not only will all known things fit into one or the other of those groups, but all future developments also will fit in those same groups. Fairly quickly, though, we must turn to enumeration to describe distinct sub-classes.

    For the analysis of cipher operation it is useful to collect ciphers into groups based on their functioning (or intended functioning). The goal is to group ciphers which are essentially similar, so that as we gain an understanding of one cipher, we can apply that understanding to others in the same group. We thus classify not by the components which make up the cipher, but instead on the "black-box" operation of the cipher itself.

    We seek to hide distinctions of size, because operation is independent of size, and because size effects are usually straightforward. We thus classify conventional block ciphers as keyed simple substitution, just like newspaper amusement ciphers, despite their obvious differences in strength and construction. This allows us to compare the results from an ideal tiny cipher to those from a large cipher construction; the grouping thus can provide benchmark characteristics for measuring large cipher constructions.

    We could of course treat each cipher as an entity unto itself, or relate ciphers by their dates of discovery, the tree of developments which produced them, or by known strength. But each of these criteria is more or less limited to telling us "this cipher is what it is." We already know that. What we want to know is what other ciphers function in a similar way, and then whatever is known about those ciphers. In this way, every cipher need not be an island unto itself, but instead can be judged and compared in a related community of similar techniques.

    Our primary distinction is between ciphers which handle all the data at once (block ciphers), and those which handle some, then some more, then some more (stream ciphers). We thus see the usual repeated use of a block cipher as a stream meta-cipher which has the block cipher as a component. It is also possible for a stream cipher to be re-keyed or re-originate frequently, and so appear to operate on "blocks." Such a cipher, however, would not have the overall diffusion we normally associate with a block cipher, and so might usefully be regarded as a stream meta-cipher with a stream cipher component.

    The goal is not to give each cipher a label, but instead to seek insight. Each cipher in a particular general class carries with it the consequences of that class. And because these groupings ignore size, we are free to generalize from the small to the large and so predict effects which may be unnoticed in full-size ciphers.

    1. BLOCK CIPHER
      • Requires the accumulation of more than one data element for ciphering.
      • Individual data elements cannot be ciphered independently.
      • Real-time information of dynamic size may mean waiting until more information completes a previous block.
      • (Sometimes stream ciphers accumulate data for convenience, as in cylinder ciphers, but nevertheless logically cipher each character independently.)
      • Probably will require padding to fill out a full block when no more data remains, which is ciphertext expansion.
      • Probably will require some added information to allow padding to be distinguished from data and removed, which is more expansion.
      • Generally reveals message length, with an uncertainty only of block size.

      1. SIMPLE SUBSTITUTION
        • Examples: Newspaper cryptograms, grade school ciphers, codebooks, and conventional block ciphers in electronic codebook mode.
        • A full bijective, invertible, non-expanding mapping.
        • Keying constitutes a permutation or re-arrangement of the table of code values.
        • Since the multiple data elements of a block combine to select a table entry, the smallest possible change to any one of those data elements should select a new and apparently random value, and thus "affect" the full ciphertext block. This is avalanche.
        • Binary-oriented simple substitution distributes bit-changes between all code values binomially, and this effect can be sampled and examined statistically, for a simple substitution signature.
        • Avalanche is two-way diffusion in the sense that "later" plaintext can change "earlier" ciphertext, within a single block, which is also a signature.
        • A conventional block cipher is intended to simulate a keyed substitution table of a size which must be vastly larger than anything which could be practically realized. At issue is the quality of that simulation.
        • Few conventional block cipher designs are scalable to toy size which would support exhaustive testing.
        • Any substitution table becomes weak when its code values are re-used.
        • Code value re-use can be minimized by randomizing the plaintext block (e.g., CBC). This distributes the plaintext evenly across the possible block values, but at some point the transformation itself must change or be exposed. And open randomization does not hide plaintext structure from brute force attack.
        • Another alternative is to have a huge block size so that code value re-use is made exceedingly unlikely. A large block also has room for a dynamic keying field which would make code value re-use even more unlikely. (Also see huge block cipher advantages.)

      2. TRANSPOSITION CIPHER
        • Example: Dynamic Transposition.
        • Arguably a special case of Simple Substitution with dynamic keying, but with sufficiently different signatures and characteristics to treat independently.
        • All elements to be ciphered first must be collected; this is the block cipher signature.
        • Avalanche is neither present, nor needed, nor helpful so the Simple Substitution signature does not exist.
        • By itself, transposition has various known weaknesses, which can be avoided by design. But unless that is done, we do not have a serious transposition cipher. As a consequence, we have just one example of the group properties.
        • Secure keying is necessarily dynamic, a block-by-block permutation of plaintext elements.
        • Ciphering by re-ordering elements necessarily requires that not all elements have the same value.
        • Ideally, the data should be value-balanced. Balance can be enforced by pre-coding or scrambler randomization.
        • Best transposition ciphering is by bit, not byte.
        • Blocks can be of variable size.
        • These characteristics combine to allow a plethora of different permutations to produce the exact same ciphertext result, thus obscuring the actual permutation involved. This hides the keying sequence.

      3. HOMOPHONIC SUBSTITUTION
        • No known example.
        • One construction is to have a a "homophonic" field as part of the plaintext block. A random value in that field thus selects a particular ciphertext from the many which each reproduce exactly the same "data" field.
        • Arguably a special case of Simple Substitution, but it is not clear that homophonic ciphers could not be built in other ways.
        • A ciphertext expanding cipher.
        • Multiple ciphertexts map to the same plaintext.
        • Coding redundancy can be used to convey per-block authentication or dynamic keying. (See huge block cipher advantages.)
        • Keying constitutes a permutation or re-arrangement of the fixed set of possible code values, plus the dynamic selection of a particular homophonic ciphertext.

      4. DYNAMICALLY SELECTABLE BLOCK SIZE
        • Block size selectable on a real-time block-by-block basis.
        • Block size typically available in powers-of-2 elements, subject to some strength minimum (e.g., 8 bytes).
        • Supports the block size in existing designs, modern larger blocks, and stronger huge blocks, all in exactly the same software executable code.
        • Huge blocks support efficient per-block authentication, homophonic operation and per-block dynamic keying. (See huge block cipher advantages.)
        • Supports real-time information of dynamic size without using chaining (e.g., CBC).
        • Can minimize padding expansion with just one block of each smaller block size.
        • Huge blocks can support ECB mode, thus avoiding the need for an IV and associated ciphertext expansion.
        • Hardware implementations can feature a constant execution time per block, making huge blocks much faster per byte than small blocks.
        • Scalable down to toy size for exhaustive testing.
        • My Mixing Ciphers may have introduced this concept to open cryptography (see my Balanced Block Mixing development articles, starting in March, 1994).

      5. DYNAMICALLY VARIABLE SIZE BLOCK

    2. STREAM CIPHER
      • Can (but may choose not to) cipher individual data elements immediately, as they arrive. This is a stream cipher signature, and can be identified by analysis of the design.
      • Does not need to fill a block, so does not need block padding.
      • Does not need padding, so does not need a padding removal structure.
      • The basic concept of streaming does not need a confusion sequence or a RNG generator.
      • Generally does reveal message length.
      • Does not need data diffusion or avalanche.
      • May have data diffusion, but if so, it is necessarily one-way (from earlier to later elements). This is a stream cipher indication.
      • The simple re-use of a block cipher to cover more data than a single block is a stream meta-cipher.

      1. CONFUSION SEQUENCE
        • The classic character-by-character confusion sequence and simple additive combiner cipher.
        • Keying is the sequence itself, or the state which selects a particuar sequence in an RNG generator.
        • With an unpredictable sequence we have a one time pad.
        • With a pseudorandom confusion sequence (from an RNG) we have a classic Vernam cipher.
        • Unfortunately, an additive combiner immediately exposes the confusion sequence under known plaintext attack, and so contributes no strength at all beyond mere balanced mixing.
        • Since the structure of the RNG is assumed to be known, an exposed confusion sequence supports attempts to reconstruct the RNG state.
        • Stronger combiners include nonlinear and keyable combiners with state, that do not immediately expose the confusion sequence.
        • More complex combiners may imply the need for correspondingly less strong confusion sequences.
        • RNG-based stream ciphers do need a message key or something of similar nature to prevent confusion sequence re-use.
        • Having a message key generally implies that amount of ciphertext expansion.

        1. Autokey
          • A confusion sequence generator (RNG) which is not closed and "free running," but which is affected by the ciphertext.
          • Ciphertext, or possibly plaintext, modifies the state in the RNG and thus the subsequent keystream.
          • Different messages sent under the same key end up having different confusion sequences without any message key.
          • If ciphertext essentially becomes the entire RNG state, we can create a random-like confusion stream which will re-synchronize after ciphertext data loss. However, data loss is rarely an applications-level issue in modern communications systems.
          • Under known plaintext attack, the common "ciphertext feedback" form exposes both the output from, and the input to, the confusion sequence RNG, which puts a lot of pressure on RNG strength.

      2. FILTERING
        • No known example.
        • The plaintext data are directly processed.
        • No confusion sequence is generated.
        • Presumably some sort of shift register with feedback.
        • Presumably something like a digital filter.

      3. MONOALPHABETIC (e.g., DES CBC)
        • The repeated use of a single, fixed substitution table.
        • Examples include: newspaper cryptograms, grade school ciphers, codebooks, and conventional block ciphers in ECB mode.
        • A substitution becomes weak when its code values are re-used.
        • Code value re-use can be minimized by randomizing the plaintext block (e.g., CBC). This distributes the plaintext evenly across the possible block values, but at some point the transformation itself must change or be exposed. And open randomization does not hide plaintext structure from brute force attack.
        • Another alternative is to use a very large block so that code value re-use is made exceedingly unlikely. A large block also has room for a dynamic keying field which would make code value re-use even more unlikely. (See huge block cipher advantages.)

      4. POLYALPHABETIC
        • The repeated use of multiple fixed substitution tables with an implicit or rotating confusion sequence.
        • By itself, the use of a known number of multiple alphabets in a regular sequence is not much stronger than a single alphabet.
        • It is of course possible to select an alphabet at pseudo-random, for example by re-keying DES after every block ciphered. This requires an RNG and an IV to select the starting state. Re-keying DES will take some time, however.
        • An improvement over random alphabets is the use of a Latin square combiner which effectively selects among a balanced set of different fixed substitution alphabets.

        1. Cylinder
          • A cipher which has or simulates multiple alphabet disks on a single rod. The plaintext message is entered one letter per disk by turning each disk so the correct letter shows in a particular row. The ciphertext is read off some other row.
          • Although operation typically occurs in "chunks" which fill up the cylinder, a full block is not required and individual characters can be ciphered. This is the stream cipher signature.
          • Primary keying is the arrangement of the alphabet around each disk, and the selection and arrangement of disks on the rod.
          • By entering the plaintext on one row, any of n-1 other rows can be sent as ciphertext; this selection is an IV.
          • If the plaintext data are redundant, it is possible to avoid sending the IV by selecting the one of n-1 possible decipherings which shows redundancy. But this is not generally possible when ciphering arbitrary binary data.
          • If an IV is selected first, each character ciphering in that "chunk" is independent of each other ciphering. There is no data diffusion.
          • In general, each disk is used at fixed periodic intervals through the text, which is weak.
          • The ciphertext selection is homophonic, in the sense that different ciphertext rows each represent exactly the same plaintext.
          • Cylinder operation is not polyphonic in the usual sense: While a single ciphertext can imply any other row is plaintext, generally only one row has a reasonable plaintext meaning.

      5. DYNAMIC
        • The use of one (monoalphabetic) or multiple (polyalphabetic) substitution tables whose contents change during ciphering.
        • Keying constitutes a starting permutation or arrangement of the table of code values.
        • An explicit substitution table which is efficiently re-keyed after every use.
        • Used as the combiner part of a conventional confusion sequence stream cipher, or to combine multiple RNG's.
        • Dynamic Substitution is important because it directly confronts the classic attack on conventional stream ciphers: The usual additive combiner immediately exposes the confusion sequence under known plaintext attack. Since the structure of the RNG is assumed to be known, an exposed confusion sequence supports attempts to reconstruct the RNG state. In contrast, a Dynamic Substitution combiner does not expose the confusion sequence, which should make the cipher stronger.

      6. ITERATIVE
        • Multiple encryption using one stream cipher repeatedly with a new random IV on each iteration so as to eventually achieve the effect of a much larger message key.
        • Each iteration seemingly must expand the ciphertext by the size of the IV, although this is probably about the same expansion we would have with a message key.
        • Unfortunately, each iteration will take some time.
        • Particularly appropriate for a cylinder cipher, as shown by W.T. Shaw.

    Cipher Testing
    If a cipher was like any other technological construction, we could just test it to see how good it was. Unfortunately, cipher strength is not a measurable engineering quantity, and as a result, strength is simply "out of control" with respect to design and manufacturing quality. However, some basic tests can and should be done to weed out the most obvious problems.

    In general, absent special coding for transmission (such as converting full binary into base-64 for email) ciphertext should be "random-like." Accordingly, we can run all sorts of tests to try to find any sort of structure or correlation in the ciphertext, or between plaintext, key, and ciphertext. The many available statistical randomness tests should provide ample opportunity for virtually unlimited testing.

    Testing Conventional Block Ciphers

    The usual or conventional block cipher is intended to emulate a huge, keyed, substitution table. Mathematically, such a function is a bijection, and the symbols in the table are a permutation. These structures might be measured, at least in theory. But very few conventional block ciphers are scalable to tiny size, and the vast size of a real block cipher allows only statistical sampling.

    One obvious issue in block cipher construction is diffusion. If the resulting emulated table really is a permutation, if we change the input value in any way, we expect the number of bits which change in the output to occur in a binomial distribution. In addition, we expect each output bit to have a 50 percent probability of changing. We can measure these things.

    Typically, we pick some random input value and cipher to get the result; then we change some bit of the input and get the new result and note which and how many bits changed. One advantage of the binomial distribution is that, as block size increases, the distribution becomes increaingly narrow (for any reasonable probability). Thus, we can hope to peer into tremendously small probabilities, which may be about as much error as we can expect to find.

    We also can develop a mean value for each output bit, or analyze a particular bit more closely, looking for correlations between input and output, or between key and output, or between the key and some aspect of the transformation between input and output. We might look at correlations between each key bit and each output bit, or between any combination of key bits versus any combination of output bits and so on. With increasingly large experiments, we can perform increasingly fine statistical analyses.

    An issue of at least potential concern is that conventional block cipher designs do not implement a completely keyed transformation , but instead implement only a tiny, tiny fraction of all possible tables of the block size. This opens the possibility of weakness in some form of correlation resulting from a tiny subset of implemented permutations. The issue then becomes one of trying to measure possible structural correlations between the set of implemented permutations and the key, including individual bits, or even arbitrary functions of arbitrary multiple bits. At real cipher size, such measurements will be difficult. Or perhaps knowledge of some subset of the transformation could lead to filling out the rest of the transformation; at real cipher size, this may be very difficult to see.

    Block Cipher Scalability

    Cipher designs which are scalable can be tested at real size when that is useful, or as tiny "toy" versions, when that is useful. Naturally, the tiny versions are not intended to be as strong as the real-size versions, nor even to be a useful cipher at that size. One purpose is to support exhaustive correlation testing to reveal structural problems which should be easier to discern in the smaller construction. The goal would be to find fault at the tiny size, and then use that to develop insight leading to a scalable attack. That same insight also should help improve the cipher design.

    One advantage of scalability is to support attacks on the same cipher at different sizes. Once we find an attack on a toy-size version, we can measure how hard that approach really is by actually doing it. Then we can scale up the cipher slightly and measure how much the difficulty has increased. That can provide true evidence which can be used to extrapolate the strength of the real-size cipher, under the given attack. I see this as vastly more believable information than we have for current ciphers.

    Another thing we might do is to measure Boolean function nonlinearity values. This measure at least has the advantage of directly addressing one form of strength: the linear predictability of each key-selected permutation.

    Yet another thing we might investigate is the number of keys that are actually different. That is, do any keys produce the same emulated table, and if not, how close are those tables? Can we find any two keys that produce the same ciphertext from the same plaintext? (See population estimation and multiple encryption.)

    Testing Stream Ciphers

    The conventional stream cipher consists of a keyed RNG or confusion generator and some sort of data and confusion combiner, usually exclusive-OR. Since exclusive-OR has absolutely no strength of its own, the strength of the classic stream cipher depends solely on the RNG. Such testing is a common activity in cryptography, using various available statistical randomness tests. (But recall that many strengthless statistical RNG's do well on such tests.) I particularly recommend runs up/down, because we can develop a useful non-flat distribution of results and then compare that to the theoretical expectation. We can do similar things with birthday tests, which are also useful in confirming the coding efficiency or entropy of really random generators.

    Modern stream ciphers with nonlinear combiners (see, for example: Dynamic Substitution) seem harder to test. Presumably we can test the ciphertext for randomness, as usual, yet that would not distinguish between the combiner and the RNG. Possibly we could test the combiner with RNG, and then the RNG separately, and compare distributions. However, it is not clear what sort of tests would provide useful insight to this construction. Alternate suggestions are welcomed.

    Ciphertext
    The result of encryption. As opposed to plaintext.

    Ciphertext contains the same information as the original plaintext, hopefully in a form which cannot be easily understood. Cryptography hides information by transforming a plaintext message into any one of a vast multitude of different ciphertexts, as selected by a key. Ciphertext thus can be seen as a code, in which the exact same ciphertext has a vast number of different plaintext interpretations. As a goal, it should be impractical to know which interpretation represents the original plaintext without knowing the key.

    Normally, ciphertext will appear random; the values in the ciphertext should occur in a generally balanced way. Normally, we do not expect ciphertext to compress to a smaller size; that implies efficient coding (also see entropy), but only for the random-like ciphertext. Since the amount of plaintext information in the message may be far smaller, from that point of view the ciphertext coding may be very inefficient.

    It also may happen that the ciphertext can be encoded inefficiently (perhaps as base-64 for email transmission). Note that such encoding does not require distinct steps for ciphering and then encoding: Some ciphers directly produce encoded (and, thus, expanded) ciphertext (see, for example: Penknife). Such ciphertext will be compressible, simply because representing information with a subset of ASCII characters is inherently less efficient than a binary representation. Thus, inefficiently coded ciphertext may well compress, and that does not imply weakness in the cipher itself.

    Ciphertext Expansion
    When the ciphertext is larger than the original plaintext. Also see IV, block code, nulls, Braid, Homophonic Substitution, and Penknife.

    Ciphertext expansion is the general situation: Stream ciphers need a message key, and block ciphers with a small block need some form of plaintext randomization, which generally needs an IV to protect the first block. Only block ciphers with a large size block generally can avoid ciphertext expansion, and then only if each block can be expected to hold sufficient uniqueness or entropy to prevent a codebook attack.

    It is certainly true that in most situations of new construction a few extra bytes are not going to be a problem. However, in some situations, and especially when a cipher is to be installed into an existing system, the ability to encipher data without requiring additional storage can be a big advantage. Ciphering data without expansion supports the ciphering of data structures which have been defined and fixed by the rest of the system, provided only that one can place the cipher at the interface "between" two parts of the system. This is also especially efficient, as it avoids the process of acquiring a different, larger, amount of store for each ciphering. Such an installation also can apply to the entire system, and not require the re-engineering of all applications to support cryptography in each one.

    Ciphertext Feedback
    1. A way of producing an autokey stream cipher, by using ciphertext values to modify the internal state of the keyed RNG keystream generator.
    2. CFB. Ciphertext Feedback is an operating mode for a block cipher.

    CFB is closely related to OFB, and is intended to provide some of the characteristics of a stream cipher from a block cipher. CFB generally forms an autokey stream cipher. CFB is a way of using a block cipher to form a random number generator. The resulting pseudorandom confusion sequence can be combined with data as in the usual stream cipher.

    CFB assumes a shift register of the block cipher block size. An IV or initial value first fills the register, and then is ciphered. Part of the result, often just a single byte, is used to cipher data, and the resulting ciphertext is also shifted into the register. The new register value is ciphered, producing another confusion value for use in stream ciphering.

    One disadvantage of this, of course, is the need for a full block-wide ciphering operation, typically for each data byte ciphered. The advantage is the ability to cipher individual characters, instead of requiring accumulation into a block before processing.

    Ciphertext Only
    The information condition of an opponent knowing only the ciphertext for an encryption, without any of the related plaintext. Also see known plaintext, defined plaintext and ciphertext only attack.

    Ciphertext Only Attack
    Any attack which takes place under ciphertext only information conditions. Implicitly, however, the opponent also must know something about the plaintext, or it will be impossible to know when a deciphering is correct. Also see known plaintext attack and defined plaintext attack.

    In a sense, the idea of a ciphertext-only attack is inherently incomplete. By themselves, symbols and code values have no meaning. So we can have all the ciphertext we want, but unless we can find some sort of structure or relationship to plaintext, we have nothing at all. The extra information necessary to identify a break could be the bit structure in the ASCII code, the character structure of language, or any other known relation. But the ciphertext is never enough if we know absolutely nothing about the plaintext. It is our knowledge or insight about the plaintext, the statistical structure, or even just the known use of one plaintext concept, that allows us to know when deciphering is correct.

    In practice, ciphertext-only attacks typically depend on some error or weakness in the encryption design which somehow relates some aspect of plaintext in the ciphertext. For example, codes that always encrypt the same words in the same way naturally leak information about how often those words are used, which should be enough to identify the plaintext. And the more words identified, the easier it is to fill in the gaps in sentences, and, thus, identify still more words. Modern ciphers are less likely to fall into that particular trap, making ciphertext-only attacks generally more academic than realistic (also see break).

    Ciphony
    Audio or voice encryption. A contraction of "ciphered telephony."

    Circuit
    In electronics, the "circular" flow of electrons from a power source, through conductors and components and back to the power source. Or the arrangement of components which allows such flow and performs some function.

    Claim
    1. In commerce, an assertion of value due.
    2. In law, a statement of ownership, as in recovering a lost item, or filing for mining or patent rights. Also see patent claims.
    3. In argumentation, a conclusion to be shown correct.

    Cleartext
    Generally speaking, plaintext. Messages transmitted without encryption or "in the clear." For time-sensitive and transient tactical information, getting the message through as soon as possible may be far more important than secrecy.

    Cloak2
    A DOS-era second-generation secret key stream cipher which I designed and implemented. Based on the nonlinear Dynamic Substitution combiner technology which I invented, patented (see locally, or @: http://www.ciphersbyritter.com/PATS/DYNSBPAT.HTM), and own. More than just a cipher, a full cryptosystem with various cipher system features:
    • Huge 992-bit keyspace.
    • Huge keystream Additive RNG with 310,048 bits of state and jitterizer nonlinear isolation.
    • Two sequential keyed Dynamic Substitution combinings.
    • The second combining is selected on a byte-by-byte basis from among 16 keyed Dynamic Substitution combiners.
    • Ciphertext can be concatenated.
    • Strong key-management by alias file.
    • Ciphers individual files, multiple files, or an entire disk.
    • Supports user-programmed batch operations.

    See the documentation:

    Also see Penknife.

    Clock
    A repetitive or cyclic timing signal to coordinate state changes in a digital system. A clock coordinates the movement of data and results through various stages of processing. Although a clock signal is digital, the source of the repetitive signal is almost always an analog circuit, typically a crystal oscillator.

    In a digital system we create a delay or measure time by simply counting pulses from a stable oscillator. Since counting operations are digital, noise effects are virtually eliminated, and we can easily create accurate delays which are as long as the count in any counter we can build.

    Closure
    When an operation on a set produces only elements in that set.

    Code
    From the Latin codex, for tree trunk or wax-covered wooden tablet.
    1. A list of rules.
    2. In cryptography, symbols, colors, shapes, flowers, musical notes, finger positions or values which stand for symbols, values, words, sentences, ideas, sequences, or even operations (as in computer "opcodes"). Often just a simple substitution between numeric values.

    Code values can easily represent not only symbols or characters, but also words, names, phrases, and entire sentences (also see nomenclator). In contrast, a cipher operates only on individual characters or bits. Classically, the meaning of each code value was collected in a codebook. Codes may be open (public) or secret.

    Coding is a very basic part of modern computation and generally implies no secrecy or information hiding. In modern usage, a code is often simply a correspondence between information (such as character symbols) and values (such as the ASCII code or Base-64). Because a code can represent entire phrases with a single number, one early application for a public code was to decrease the cost of telegraph messages.

    In general, secret codes are weaker than ciphers, because a typical code will simply substitute or transform each different word or letter into a corresponding value. Thus, the most-used plaintext words or letters also become the most-used code or ciphertext values and the statistical structure of the plaintext remains exposed. Then the opponent easily can find the most-used ciphertext values and realize that they represent the most-used plaintext words. Accordingly, it is common to superencipher a coded message in an attempt to hide the codebook values.

    A meaningful code is more than just data, being also the interpretation of that data. The main concept of modern cryptography is the use of a key to select one interpretation from among vast numbers of different interpretations, so that meaning is hidden from those who do not have both the appropriate decryption program and key. Each particular ciphertext is interpreted by the decryption system to produce the desired plaintext. The pairing of value plus interpretation to produce or do something occurs in various places:

    • In cryptography, we have ciphertext plus the key-selected transformation which interprets that ciphertext.
    • In cryptanalysis, we have pseudorandom sequences, plus the finite state machine and state which creates a particular sequence.
    • In computer hardware, we have opcode values plus a computer which interprets those different values as different operation commands.
    • In computer programming, we typically have raw binary data plus procedures which interpret that data.
    • In computer-based documents, we have various formats (e.g., .DOC or .PIF files) plus a program to properly display those documents.
    • In biology we have DNA plus the cell which interprets DNA, both being required to express the original meaning.

    In real life, many useful things do require a particular thing to use them. For example, gasoline provides energy for cars, but only because cars have the appropriate engine to perform the desired conversion. Similarly, bullets require guns, radio broadcasting stations require radios and so on. But that probably reaches beyond the idea of a code, which is basically limited to information- or symbol-oriented transformations.

    Codebook
    Literally, the listing or "book" of code transformations. More generally, any collection of such transformations. Classically, letters, common words and useful phrases were numbered in a codebook; messages transformed into those numbers were "coded messages." Also see nomenclator. A "codebook style cipher" refers to a block cipher.

    Codebook Attack
    A form of attack in which the opponent simply tries to build or collect a codebook of all the possible transformations between plaintext and ciphertext (under a single key). This is the classic approach we normally think of as "codebreaking." Also called "bookbreaking."

    The usual ciphertext only approach depends upon the plaintext having strong statistical biases which make some values far more probable than others, and also more probable in the context of particular preceding known values. While this is not known plaintext, it is a form of known structure in the plaintext. Such attacks can be defeated if the plaintext data are randomized and thus evenly and independently distributed among the possible values (see balance).

    When a codebook attack is possible on a block cipher, the complexity of the attack is controlled by the size of the block (that is, the number of elements in the codebook) and not the strength of the cipher. This means that a codebook attack would be equally effective against either DES or Triple-DES.

    One way a block cipher can avoid a codebook attack is by having a large block size which will contain an unsearchable amount of plaintext "uniqueness" or entropy. Another approach is to randomize the plaintext block, by using an operating mode such as CBC, or multiple encryption. Yet another approach is to change the key frequently, which is one role of the message key introduced at the cipher system level.

    Codebreaking
    Specifically, the work of attempting to attack and break a secret code. More generally, attempting to defeat any kind of secrecy system.

    Codebreaking is what we normally think of when hearing the WWII crypto stories, especially the Battle of Midway, because many secrecy systems of the time were codes. According to the story, the Japanese are preparing an attack on Midway island, and have given Midway the coded designation "AF." American cryptanalysts have exposed the designator "AF," but not what it represents. Assuming the "AF" to be Midway, American codebreakers have Midway falsely report the failure of their fresh-water plant in open traffic. Then, two days later, intercepted Japanese traffic states that "AF" is short of fresh water. Thus, "AF" is confirmed as Midway.

    Note that there had to be a way to identify the actual target (plaintext) with the code value (ciphertext) before the meaning was exposed. Simply having the ciphertext itself, without finding structure in the ciphertext or some relationship to plaintext, is almost never enough, see ciphertext-only attack.

    Codeword
    In coding theory, a particular sequence or block of n elements, each element taken from alphabet A. A particular block-encoded result value.

    Coding Theory
    The engineering field concerned with encoding data for communications and storage. Different codings have various properties, including:
    • Size
    • Error-detection
    • Error-correction
    • Multiple logical channels
    • Control codes
    • Bit balance
    See block code and entropy.

    Coefficient
    In a mathematical expression, a factor of a term. Typically a constant value or simple variable which multiplies the parameter of interest (often X). However, any proper subset grouping of factors technically would be a coefficient of the overall product.

    Cognitive Dissonance
    The uncomfortable psychological reaction to the experience of finding that some of our core beliefs are factually wrong.

    The classic example is of a cult who believed the Earth was going to end at a particular time. Supposedly, many members gave up their houses and jobs and so on, but the Earth did not end. As a consequence, less-involved members generally accepted that their belief was false. But more-involved members instead insisted that the actions of the cult showed their faith, which was then rewarded by the Earth not ending.

    Obviously it is difficult to use logic to address issues of faith, but science is not a faith and does not require belief. Therefore, when we find that current scientific positions are wrong, they can be changed with only minor discomfort and anguish. Supposedly. (Also see mere semantics and old wives' tale.)

    Combination
    In mathematics, the term for any particular subset of symbols, independent of order. (Also called the binomial coefficient.) The number of combinations of n things, taken k at a time, read "n choose k" is:
        n
       ( ) = C(n,k) =  n! / (k! (n-k)!)
        k
    
    Also,
        n     n           n
       ( ) = ( ) = 1     ( ) = n
        0     n           1
    

    See the combinations section of the "Base Conversion, Logs, Powers, Factorials, Permutations and Combinations in JavaScript" page (locally, or @: http://www.ciphersbyritter.com/JAVASCRP/PERMCOMB.HTM#Combinations). Also see permutation.

    Combinatoric
    In mathematics, combinatorics is related to counting selections, arrangements or other subsets of finite sets. One result is to help us understand the probability of a particular subset in the universe of possible values.

    Consider a conventional block cipher: For any given size block, there is some fixed number of possible messages. Since every enciphering must be reversible (deciphering must work), we have a 1:1 mapping between plaintext and ciphertext blocks. The set of all plaintext values and the set of all ciphertext values is the same set; particular values just have different meanings in each set.

    Keying gives us no more ciphertext values, it only re-uses the values which are available. Thus, keying a block cipher consists of selecting a particular arrangement or permutation of the possible block values. Permutations are a combinatoric topic. Using combinatorics we can talk about the number of possible permutations or keys in a block cipher, or in cipher components like substitution tables.

    Permutations can be thought of as the number of unique arrangements of a given length on a particular set. Other combinatoric concepts include binomials and combinations (the number of unique given-length subsets of a given set).

    Combiner
    In a cryptographic context, a combiner is a mechanism which mixes two data sources into a single result. A "combiner style cipher" refers to a stream cipher.

    Reversible combiners are pretty much required to encipher plaintext into ciphertext in a stream cipher. The ciphertext is then deciphered into plaintext using a related inverse or extractor mechanism. The classic examples are the stateless and strengthless linear additive combiners, such as addition, exclusive-OR, etc.

    Reversible and nonlinear keyable combiners with state are a result of the apparently revolutionary idea that not all stream cipher security need reside in the keying sequence. Examples include:

    Irreversible or non-invertible combiners are often proposed to mix multiple RNG's into a single confusion sequence, also for use in stream cipher designs. But that is harder than it looks. For example, see:

    Also see balanced combiner, complete, and also "The Story of Combiner Correlation: A Literature Survey," locally or @: http://www.ciphersbyritter.com/RES/COMBCORR.HTM.

    Common Mode
    In electronics, typically an identical signal on both conductors of a balanced line. A transformer winding across that line would ignore common mode signals. As opposed to differential mode.

    Commutative
    In abstract algebra, a dyadic operation in which exchanging the two argument values must produce the same result: a + b = b + a.

    Also see: associative and distributive.

    Complete
    A term used in S-box analysis to describe a property of the value arrangement in an invertible substitution or, equivalently, a conventional block cipher. If we have some input value, and then change one bit in that value, we expect about half the output bits to change; this is the result of diffusion; when partial diffusion is repeated we develop avalanche; and the ultimate result is strict avalanche. Completeness tightens this concept and requires that changing a particular input bit produce a change in a particular output bit, at some point in the transformation (that is, for at least one input value). Completeness requires that this relationship occur at least once for every combination of input bit and output bit. It is tempting to generalize the definition to apply to multi-bit element values, where this makes more sense.

    Completeness does not require that an input bit change an output bit for every input value (which would not make sense anyway, since every output bit must be changed at some point, and if they all had to change at every point, we would have all the output bits changing, instead of the desired half). The inverse of a complete function is not necessarily also complete.

    As originally defined in Kam and Davida:

    "For every possible key value, every output bit ci of the SP network depends upon all input bits p1,...,pn and not just a proper subset of the input bits." [p.748] -- Kam, J. and G. Davida. 1979. Structured Design of Substitution-Permutation Encryption Networks. IEEE Transactions on Computers. C-28(10):747-753.

    Complex Number
    An ordered pair of real numbers (x,y) treated as the vector from the origin at (0,0) to (x,y), and represented either as real and imaginary rectangular coordinates (x,y) or as the magnitude and angle (mag,ang) of the vector. Denoted C. The rectangular representation is also called "Cartesian"; the magnitude and angle form is called "polar."

    To build an appropriate algebra and make complex numbers a field, the rectangular representation is written as (x+iy) [or (x+jy)], where i [or j] has the value SQRT(-1). The symbol i is called "imaginary," but we might just consider it a way for the algebra to relate the values in the ordered pair. Clearly, i * i = -1.

    With appropriate rules like:

    addition:        (a+bi) + (c+di) = (a+c) + (b+d)i
    multiplication:  (a+bi) * (c+di) = (ac-bd) + (bc+ad)i
               c+di   ac+bd    ad-bc
    division:  ---- = ----- + (-----)i
               a+bi   aa+bb    aa+bb
    
    we get complex algebra, and can perform most operations and even evaluate trignometric and other complex functions like we do with reals.

    In cryptography, perhaps the most common use of complex numbers occurs in the FFT, which typically transforms values in rectangular form. Sometimes we want to know the magnitude or length of the implied vector, which we can get by converting the rectangular (x,y) representation into the (mag,ang) representation:

    magnitude:   mag(z) = SQRT( x*x + y*y )
    angle:       ang(z) = arctan( y / x)
    
    Note: Computer arctan(x) functions are generally unable to
      place the angle in the proper quadrant, but arctan2(x,y)
      routines -- with two input parameters -- may be available
      to do so.
    

    Component
    A part of a larger construction; a building-block in an overall design or system. Modern digital design is based on the use of a few general classes of pre-defined, fully-specified parts. Since even digital logic can use or even require analog values internally, by enclosing these values the logic component can hide complexity and present the appearance of a fully digital device.

    The most successful components are extremely general and can be used in many different ways. Even as a brick is independent of the infinite variety of brick buildings, a flip-flop is independent of the infinite variety of logic machines which use flip-flops.

    The source of the ability to design and build a wide variety of different electronic logic machines is the ability to interconnect and use a few very basic but very general parts.

    Electronic components include

    The use of individual components to produce a working complex system in production requires: first, a comprehensive specification for each part; and next, full testing to guarantee that each part actually meets the specification (see: quality management).

    Digital logic is normally specified to operate correctly over a range of supply voltage, temperature, loading, clock rates, and other appropriate parameters. Specified limits (minimum's or maximum's) guarantee that a working part will operate correctly even with the worst case of all parameters simultaneously. This process allows large, complex systems to operate properly in practice, provided the designer makes sure that none of the parameters can exceed their correct range.

    Cryptographic system components include:

    Composite
    An integer which is not prime, and is, therefore, a product of at least two primes. See factor.

    Compression
    See data comprression.

    Compromise
    To expose a secret.

    Computer
    Originally the job title for a person who performed a laborious sequence of arithmetic computations. Now a machine for performing such calculations.

    A logic machine with:

    1. Some limited set of fundamental computations. Typical operations include simple arithmetic and Boolean logic. Each operation is selected by a particular operation code value or "opcode." This is a hardware interpretation of the opcode.
    2. The ability to follow a list of instructions or commands, performing each in sequence. Thus capable of simulating a wide variety of far more complex "instructions."
    3. The ability to execute or perform at least some instructions conditionally, based on parameter values or intermediate results.
    4. The ability to store values into a numbered "address space" which is far larger than the instruction set, and later to recover those values when desired.

    The general model of mechanical computation is the finite state machine, which is absolutely deterministic and, thus, predictable.

    Also see: source code, object code, software, system design, Software Engineering and Structured Programming.

    COMSEC
    The military abbreviation for "communications security." Communications security includes:

    Conclusion
    In the study of logic, the last statement in an argument, deduced from the preceeding premises.

    Condition
    A requirement for a statement to be true. A limitation on statement universal applicability. Also see: assumption and premise.

    Conductor
    A material in which electron flow occurs easily. Typically a metal; usually copper, sometimes silver, brass or aluminum. A wire. As opposed to an insulator and a semiconductor.

    As a rule of thumb, a cubic centimeter (cc) of a solid has about 1024 or 1E24 atoms. In a metal, usually each atom contributes one or two electrons, so a metal has about 1024 (1E24) free electrons per cc. This massive number of free electrons has a tiny resistance to current flow of something like 10-6 ohms across a cubic centimeter of copper, or about one microhm per cm3. Apparently the International Annealed Copper Standard (IACS) says that annealed copper with a cross sectional area of a square centimeter should have a resistance of about 1.7241 microhms/cm (at 20 degrees Celsius), which is satisfactorily close.

    A cube with one millimeter sides has 1/100 the cross sectional area of a centimeter cube (and is about like AWG 17 wire), and so would have 100x the resistance per cm., but also is only 1/10 the length, for about 17 microhms per millimeter copper cube. A meter of AWG 17 wire would have 1000 millimeter-size cubes at 17 microhms each, so we would expect it to have about 17 milliohms total resistance. As a check, separate wire tables give the resistance of AWG 17 at 5.064 ohms per 1000ft (304.8m), which is 0.017 ohms (17 milliohms) per meter.

    RESISTANCE RELATIVE TO COPPER (cm cube = 1.7241 microhms)
                      Resist  Temp Coef  Thermal Cond  Melts (deg C)
    Silver (Ag)        0.95     0.0038       4.19       960.5
    Copper (Cu)        1.00     0.00393      3.88      1083
    Gold (Au)          1.416    0.0034       2.96      1063
    Aluminum (Al)      1.64     0.0039       2.03       660
    Bronze (Cu+Sn)     2.1        ---         ---      1280
    Tungsten (W)       3.25     0.0045       1.6       3370
    Zinc (Zn)          3.4      0.0037       1.12       419
    Brass (Cu+Zn)      3.9      0.002        1.2        920
    Nickel (Ni)        5.05     0.0047       0.6       1455
    Iron (Fe)          5.6     ~0.005        0.67      1535
    Tin (Sn)           6.7      0.0042       0.64       231.9
    Chromium (Cr)      7.6        ---         ---      2170
    Steel (Fe+C)     ~10          ---        0.59      1480
    Lead (Pb)         12.78     0.0039       0.344      327
    Titanium (Ti)     47.8        ---        0.41      1800
    Stainless (-->)   52.8        ---        0.163     1410   (Fe+Cr+Ni+C)
    Mercury (Hg)      55.6      0.00089      0.063      -38.87
    Nichrome (Ni+Cr)  65        0.00017      0.132     1350
    Graphite (C)     590          ---         ---      3800
    Carbon (C)      2900       -0.0005        ---      3500
    

    Confidential
    Information intended to be secret.

    Confusion
    Those parts of a cipher mechanism which change the correspondence between input values and output values. In contrast to diffusion.

    Confusion Sequence
    The sequence combined with data in a conventional stream cipher. Normally produced by a random number generator, it is also called a running key or keystream.

    Conjecture
    A statement which is hoped to be true. A proposed theorem. Also see: proposition.

    Consequent
    In the study of logic, the then-clause of an if-then statement. Also see: antecedent

    Conspiracy
    1. A secret agreement among like-minded individuals to work toward a particular hidden goal despite outward appearances.
    2. Legally, an agreement to break the law.

    In a conspiracy, multiple individuals can each contribute a minor action to accumulate a large effect. One obvious approach is to use gossip to give the impression that all right-thinking people are against some one or some thing. A conspiracy can be difficult to oppose, because a major effect can be achieved with minor actions that individually do not call for a major response.

    Constant
    In mathematics and computing, a symbol or explict value which does not or cannot change during the operation. As opposed to a variable. Also see: expression and equation.

    Contextual
    In the study of logic, an observed fact dependent upon other facts not being observed. Or a statement which is conditionally true, provided other unmentioned conditions have the appropriate state. As opposed to absolute.

    Contradiction
    In the study of logic, an expression which is always false and cannot be true. In contrast to tautology. Also see counterexample and paradox.

    Conventional Block Cipher
    That proper subset of block ciphers which emulates a huge simple substitution or a huge bijection. Also see block cipher and stream cipher models.

    Conventional Cipher
    A secret key cipher.

    Congruence
    Casually speaking, the remainder after a division of integers.

    In number theory we say than integer a (exactly) divides integer b (denoted a | b) if and only if there is an integer k such that ak = b.

    In number theory we say that integer a is congruent to integer b modulo m, denoted a = b (mod m), if and only if m | (a - b). Here m is the divisor or modulus.

    Conventional Current Flow
    In electronics, the idea that current flow occurs in the direction opposite to the direction of electron flow. This occurs because we assign a negative charge to electrons. Conventional current flow occurs from anode to cathode within a device, and from the cathode of one device to the anode of another. For example, conventional current flow starts at the cathode or (+) end of a battery, connects to the anode of a component and flows out the cathode, which connects to the anode or (-) end of the battery.

    Convolution
    Polynomial multiplication. A multiplication of each term against each other term, with no "carries" from term to term. Also see correlation.

    Used in the analysis of signal processing to develop the response of a processing system to a complicated real-valued input signal. The input signal is first separated into some number of discrete impulses. Then the system response to an impulse -- the output level at each unit time delay after the impulse -- is determined. Finally, the expected response is computed as the sum of the contributions from each input impulse, multiplied by the magnitude of each impulse. This is an approximation to the convolution integral with an infinite number of infinitesimal delays. Although originally accomplished graphically, the process is just polynomial multiplication.

    It is apparently possible to compute the convolution of two sequences by taking the FFT of each, multiplying these results term-by-term, then taking the inverse FFT. While there is an analogous relationship in the FWT, in this case the "delays" between the sequences represent mod 2 distance differences, which may or may not be useful.

    Copyright
    A U.S. federal (and worldwide) right by which the owner of a creative work can prevent others from copying and thus stealing that work. Copyright covers "original works of authorship" that are "fixed in a tangible form of expression," including: books, pamphlets, musical scores and plays; pictures, graphics and sculpture; movies, audio recordings and architecture. Included as literary works are computer program source code, and compilations of existing material, facts or data, which may include even simple lists of facts, when produced by creative selection. A copyright owner has the exclusive right to: reproduce the work, to derive subsequent works, to distribute copies, and to perform or display the work. Copyright is an intellectual property right; also see plagiarism.

    Copyright protects a particular expression, but not the underlying idea, process or function it may perform, which is the province of patent protection. Copyright protects form, not content: Copyright can protect particular text and diagrams, but not the described concept. In general, copyright comes into existence simply by creating a picture or manuscript or making a selection; theoretically, no notice or registration is required. (See the Library of Congress circular "Copyright Basics": http://www.loc.gov/copyright/circs/circ1.html#cr). However, formal registration is required before a lawsuit can be filed, and registration within 3 months of publication supports recovery of statutory damages and attorney fees; otherwise, apparently only actual damages can be recovered. Similarly, no copyright notice is required, but having one like this:

    Copyright 1991 Terry Ritter. All Rights Reserved.
    may avoid an "innocent infringement" defense. Protection currently lasts 70 years beyond the death of the author, or 95 years from date of publication for works for hire. Copyright is not handled by the PTO but instead by the United States Copyright Office (http://lcweb.loc.gov/copyright/) in the Library of Congress.

    Corollary
    An incidental result from the proof of a theorem. Also see: lemma.

    Correlation
    1. A statistical relationship, not necessarily linear, typically between two variables. A co-relation (Galton, 1869).
    2. The probability that two sequences of symbols will, in any position, have the same symbol. (We expect two random binary sequences to have the same bit values about half the time.)
    3. The general idea that symbols or sequences of symbols will have some non-random relationship in value. For example, each new bit in an LFSR sequence is perfectly correlated to some computation involving earlier bits in the sequence. (Also see independent and rule of thumb.)

    One way to evaluate a common correlation of two real-valued sequences is to multiply them together term-by-term and sum all results. If we do this for all possible "delays" between the two sequences, we get a "vector" or 1-dimensional array of correlations which is a convolution. Then the maximum value represents the delay with the best simple correlation.

    Correlation Coefficient
    A statistic or measure of simple linear correlation between two binary sequences. Correlation coefficient values range from -1 to +1 and are related to the probability that, given a symbol from one sequence, the other sequence will have that same symbol. A value of:
    • -1 implies a 0.0 probability (the second sequence is the complement of the first),
    • 0 implies a 0.5 probability (no simple correlation found)
    • +1 implies a 1.0 probability (the sequences are the same).
    "The correlation coefficient associated with a pair of Boolean functions f(a) and g(a) is denoted by C(f,g) and is given by C(f,g) = 2 * prob(f(a) = g(a)) - 1 ." -- Daemen, J., R. Govaerts and J. Vanderwalle. 1994. Correlation Matrices. Fast Software Encryption. 276. Springer-Verlag.

    Counterexample
    An exception to a conjecture. A statement which meets the requirements or premises of a conjecture, but is known to have a different conclusion. Or a true statement which contradicts some part of the conjecture. A statement which refutes the proof of a lemma or theorem.

    There are two classes: A local counterexample refutes a lemma, but not necessarily the main conjecture. A global counterexample refutes the main conjecture.

    Counter Mode
    That operating mode of a block cipher in which a count value is enciphered and the result used for some ciphering purpose. Typically, the counter and cipher are used as an RNG or confusion sequence generator in stream cipher systems.

    Note that integer counting produces perhaps the best possible signal for investigating block cipher deficiencies in the rightmost bits. Accordingly, incrementing by some large random constant, or using some sort of LFSR or other polynomial counter which changes about half its bits on each step may be more appropriate.

    Counting Number
    The set: {1, 2, 3, ...}. The positive integers. The natural numbers without zero.

    Covariance
    In statistics, a measure of the extent to which random variable X will predict the value of random variable Y. When X and Y are independent, the covariance is 0. When X and Y are linearly related, the covariance squared is the variance of X times the variance of Y.

    Coverage
    The affected proportion of the whole. In fault tolerance, the probability a system failure is prevented, given a particular fault. In testing, the tested proportion of the whole specification.

    CRC
    Cyclic Redundancy Check: A fast error detecting hash based on mod 2 polynomial operations for speed and simplicity. Also see checksum.

    CRC error-checking is widely used in practice to check the data recovered from magnetic storage. When data are written to disk, a CRC of the original data is computed and stored along with the data itself. When data are recovered from disk, a new CRC is computed from the recovered data and that result compared to the recovered CRC. If the CRC's do not match, we have a "CRC error."

    Computer disk-read operations always have some chance of a "soft error" which does not re-occur when the same sector is re-read, so the usual hardware response is to try again, some number of times. If that does not solve the problem, the error may be reported to the user and could indicate the start of serious disk problems.

    A CRC operation is essentially a remainder over the huge numeric value which is the data; the mod 2 polynomials make this "division" both faster and simpler than one might expect. Related techniques like integer or floating point division can have similar power, but are unlikely to be as simple. In general, "division" techniques only miss errors which are some product of the divisor, and so n-bit codes miss only 1 out of every 2n (that is, 2**n) possible errors, on average. Earlier techniques, such as checksum are significantly less able to detect errors in real data.

    The CRC result is an excellent (but linear) hash value corresponding to the data. Compared with other hash alternatives, CRC's are simple and straightforward. They are well-understood. They have a strong and complete basis in mathematics, so there can be no surprises. CRC error-detection is mathematically tractable and provable without recourse to unproven assumptions. And CRC hashes do not need padding. None of this is true for most cryptographic hash constructions.

    For error-detection, the CRC register is first initialized to some fixed value known at both ends, nowadays typically "all 1's." Then each data element is processed, each of which changes the CRC value. When all of the data have been processed, the CRC result is sent or stored at the end of the data. Frequently the CRC result first will be complemented, so that a CRC of the data and the complemented result will produce a fixed "magic number." This allows efficient hardware error-checking, even when the hardware does not know how large the data block will be in advance. (Typically, the end of transmission, after the CRC, is indicated by a hardware "done" signal.)

    Nowadays, CRC's are often computed in software which is generally more efficient with larger data quantities. Thus we see 8-bit, 16-bit or 32-bit data elements being processed. However, CRC's can be computed on individual data bits, and on records of arbitrary bit length, including zero bits, one bit, or any uneven or dynamic number of bits. As a consequence, no padding is ever needed for CRC hashing.

    Here is a code snippet for a single-bit left-shift CRC operation:

       if (msb(crc) == databit)
          crc = crc << 1;
       else
          crc = (crc << 1) ^ poly;
    
    This fragment needs to execute 8 times to compute the CRC for a full data byte. However, a better way to process a byte in software is to pre-compute a 256-element table representing every possible CRC change corresponding to a single byte. The table value is selected by a data byte XORed with the current top byte of the CRC register (in a left-shift implementation).

    In the late 60's and early 70's, the first CRC's were initialized as "all-0's." Then it was noticed that extra or missing 0-bits at the start of the data would not be detected, so it became virtually universal to init the CRC as "all-1's." In this case, extra or missing zeros at the start are detected, and extra or missing ones at the start are detected as well.

    It is possible for multiple errors to occur and the CRC result to end up the same as if there were no error. But unless the errors are introduced intentionally, this is very unlikely. Various common errors are detected absolutely, such as:

    • Any single bit added, anywhere.
    • Any single bit deleted.
    • Any single bit changed.
    • All "burst errors" (contiguous bits in error) of length smaller than the polynomial.

    If we have enough information, it is relatively easy to compute error patterns which will take a CRC value to any desired CRC value. Because of this, data can be changed in ways which will produce the original CRC result. Consequently, no CRC has any appreciable cryptographic strength, but some applications in cryptography need no strength:

    • One example is key processing, where the uncertainty in a User Key phrase of arbitrary size is collected into a hash result of fixed size. In general, the hash result would be just as good for the opponent as the original key phrase, so no strength shield could possibly improve the situation.
    • Another example is the hash accumulation of the uncertainty in slightly uncertain physically random or really random events. When true randomness is accumulated, it is already as unknowable as any strength shield could make it.

    On the other hand, a CRC, like most computer hashing operations, is normally used so that we do not have "enough information." When substantially more information is hashed than the CRC can represent, any particular CRC result will be produced by a vast number of different input strings. In this way, even a linear CRC can be considered an irreversible "one way" or "information reducing" transformation. Of course, when a string shorter than the CRC polynomial is hashed, it should not be too difficult to find the one string that could produce any particular CRC result.

    The CRC polynomial need not be particularly special. Unlike the generator polynomials used in LFSR's, a CRC poly need not be primitive nor even irreducible. Indeed, the early 16-bit CRC polys were composite with a factor of "11" which is equivalent to the information produced by a parity bit. (Since parity was the main method of error-detection at the time, the "11" factor supported the argument that CRC was better.) However, modern CRC polys generally are primitive, which allows the error detection guarantees to apply over larger amounts of data. It also allows the CRC operation to function as an RNG. But the option exists to use secret random polynomials to detect errors without being as predictable as a standard CRC. Polynomial division does not require mathematical structure (such as an irreducible or primitive), beyond the basic mod 2 operations.

    Different CRC implementations can shift left or right, take data lsb or msb first, and be initialized as zeros or ones, each option naturally producing different results. Various CRC standards specify different options. Obviously, both ends must do things the same way, but it is not necessary to conform to a standard to have quality error-detection for a private or new design. Variations in internal handling can make a CRC with one set of options produce the same result as a CRC with other options.

    When the logical complement of a CRC result is appended to the data and processed msb first, the CRC across that data and the result produces a "magic" value which is a constant for a particular poly and set of options. In general, the sequence reverse of a good poly is also a good poly, and there is some advantage to having CRC polys which are about half 1's. In some notations we omit the msb which is always 1 (as is the lsb). For notational convenience, we can write x3 + x2 + x + 1 as: 3,2,1,0. These are the positions of 1-bits in the poly. Common CRC polys include:

       Name     Hex        Set Bits
    
       CRC16    8005       16,15,2,0
       CCITT    1021       16,12,5,0
       CRC24a   800063
       CRC24b   800055
       CRC24c   861863
       CRC32    04c11db7   32,26,23,22,16,
                           12,11,10,8,7,5,4,2,1,0
       SWISS-PROT          64,4,3,1,0
       SWISS-PROT Impr     64,63,61,59,58,56,55,52,49,48,
       (D. Jones)          47,46,44,41,37,36,34,32,
                           31,28,26,23,22,19,16,
                           13,12,10,9,6,4,3,0
    

    Also see: reverse CRC, and

    Crib
    In cryptanalysis, a small amount of known plaintext which is assumed or guessed to help attack a cipher.

    CRNG
    cryptographic random number generator. A cryptographic RNG.

    CRT
    1. Cathode Ray Tube. The large vacuum tube with phosphor display which is still vastly the most common television and computer monitor display.
    2. In mathematics, Chinese Remainder Theorem.

    Cryptanalysis
    That aspect of cryptology which concerns the strength of a cryptographic system, and the penetration or breaking of a cryptographic system. Also known as codebreaking, cryptanalysis is first of all analysis, the picking apart of a cipher and/or cipher system to form a deep understanding of the operations and their consequences. Also see: cryptography war and traffic analysis. Also see: risk analysis, tree analysis, fault tree analysis and threat model. Also see: hypothesis.

    In normal cryptanalysis we start out knowing plaintext, ciphertext, and cipher construction. The only thing left unknown is the key. A practical attack must recover the key. (Or perhaps we just know the ciphertext and the cipher, in which case a real attack would recover plaintext.) Simply finding a distinguisher (showing that the cipher differs from the chosen model) is not, in itself, an attack or break.

    What Makes a Cipher "Strong"?

    Because no theory guarantees strength for any conventional cipher (see, for example, the one time pad and proof), ciphers traditionally have been considered "strong" when they have been used for a long time with "nobody" knowing how to break them easily.

    Expecting cipher strength because a cipher is not known to have been broken is the logic fallacy of ad ignorantium: a belief which is claimed to be true because it has not been proven false.

    Cryptanalysis seeks to extend this admittedly-flawed process by applying known attack strategies to new ciphers (see heuristic), and by actively seeking new attacks. Unfortunately, real attacks are directed at particular ciphers, and there is no end to different ciphers. Even a successful break is just one more trick from a virtually infinite collection of unknown knowledge.

    In cryptanalysis it is normal to assume that at least known-plaintext is available; often, defined-plaintext is assumed. The result is typically some value for the amount of work which will achieve a break (even if that value is impractical); this is the strength of the cipher under a given attack. Different attacks on the same cipher may thus imply different amounts of strength. While cryptanalysis can demonstrate "weakness" for a given level of effort, cryptanalysis cannot prove that there is no simpler attack (see, for example, attack tree and threat model):

    Lack of proof of weakness is not proof of strength.

    Indeed, when ciphers are used for real, the opponents can hardly be expected to advertise a successful break, but will instead work hard to reassure users that their ciphers are still secure. The fact that apparently "nobody" knows how to break a cipher is somewhat less reassuring from this viewpoint. (Also see the discussion: "The Value of Cryptanalysis," locally, or @: http://www.ciphersbyritter.com/NEWS3/MEMO.HTM). For this reason, using a wide variety of different ciphers can make good sense: That reduces the value of the information protected by any particular cipher, which thus reduces the rewards from even a successful attack. Having numerous ciphers also requires the opponents to field far greater resources to identify, analyze, and automate breaking (when possible) of each different cipher. Also see: Shannon's Algebra of Secrecy Systems.

    How Are Ciphers Cracked?

    In general, a cipher can be seen as a key-selected set of transformations from plaintext to ciphertext (and vise versa for deciphering). A conventional block cipher takes a block of data or a block value and transforms that into some probably different value. The result is to emulate a huge, keyed substitution table, so, for enciphering, we can write this very simple equation:

        E[K][PT] = CT,  or  E[K,PT] = CT
    
    where PT = plaintext block value, K = Key, and CT = ciphertext block value. The brackets "[ ]" mean the operation of indexing: the selection of a particular position in an array and returning that element value. Here, E[K] represents a particular, huge, emulated encryption "table," while E[K][PT] selects a single entry (a block value) from that table. So we can recover the plaintext with:
        D[K][CT] = PT,  or  D[K,PT] = CT
    
    where D[K] represents an inverse or decryption table. But an attacker does not know and thus must somehow develop the decryption table.

    We assume that an opponent has collected quite a lot of information, including lots of plaintext and the associated ciphertext (a condition we call known plaintext). The opponent also has a copy of the cipher and can easily compute every enciphering or deciphering transformation. What the opponent does not have, and what he is presumably looking for, is the key. The key would expose the myriad of other ciphertext block values for which the opponent has no associated plaintext.

    We might imagine the opponent attacking a cipher with a deciphering machine having a huge "channel-selector" dial to select a key value. As one turns the key-selector, each different key produces a different deciphering result on the display. So all the opponent really has to do is to turn the key dial until the plaintext message appears. Given this extraordinarily simple attack (known as brute force), how can any cipher be considered secure?

    In a real cipher, we make the key dial very, very, very big! The keyspace of a real cipher is much too big, in fact, for anyone to try each key in a reasonable amount of time, even with massively-parallel custom hardware. That leaves the opponent with a problem: brute force does not work.

    Nevertheless, the cipher equation seems exceedingly simple. There is one particular huge emulated table as selected by the key, and the opponent has a sizable set of positions and values from that table. Moreover, all the known and unknown entries are created by exactly the same mechanism and key. So, if the opponent can in some way relate the known entries to the rest of the table, thus predicting unknown entries, the cipher may be broken. Or if the opponent can somehow relate known plaintexts to the key value, thus predicting the key, the key may be exposed. And with the key, ciphertext for which there is no corresponding plaintext can be exposed, thus breaking the cipher. Finding these relationships is where the cleverness of the individual comes in. In a real sense, a cipher is a puzzle, and we currently cannot guarantee that there is no particular "easy" way for a smart team to solve it.

    One peculiarity of conventional block ciphers is that they cannot emulate all possible tables, but instead only a tiny, tiny fraction thereof (see block cipher). Even what we consider a huge key simply cannot select from among all possible tables because there are far too many. Now, the "tiny fraction" of tables actually emulated is still too many to traverse (this is a "large enough" keyspace), but, clearly, some special selection is happening which might be exploited. Having even one particular value at one known table position is sufficiently special that we expect that only one key would produce that particular relationship in a conventional cipher. So, in practice, just one known-plaintext pair generally should be sufficient to identify the correct key, if only we could find some way to do it.

    More Realistic Attacks

    In academic cryptanalysis we normally assume that we do not know the key, but do know the cipher and everything about it. We also assume essentially unlimited amounts of known plaintext to use in an attack to find the key. In practice things are considerably different.

    In practical cryptanalysis we may not know which cipher has been used. The cipher may not ever have been published, or may have been modified from the base version in various ways. Even a cipher we basically know may have been used in a way which will disguise it from us, for example:

    • The external key may have been custom processed or hashed and may not be the key the internal cipher uses. The key value could have been casually bit-reversed or byte-reversed or oddly padded or so on. Or the key itself could have been intentionally enciphered.
    • The message text or data could be changed in just as many different ways.
    So even if we have the right key, we can only show that by checking every cipher we know in every way it could have been used.

    Unknown Ciphers

    Selecting among different ciphers is part of Shannon's 1949 Algebra of Secrecy Systems. In a modern computer implementation, we could select ciphers dynamically. The number of selectable transformations increases exponentially when several ciphers are used in sequence (multiple encryption). Considering Shannon's academic work in this area, the use of well-known standardized designs is, ironically and rather sadly, current cryptographic orthodoxy (see risk analysis).

    The general mathematical model of ciphering is that of a keyed transformation (a mapping or function). Numerically, we can make the general model work for a system of multiple ciphers by allowing some "key" bits to select the cipher, with the rest of the key bits going to key that cipher. But in the adapted model, different parts of the key will have vastly different difficulties for the opponent. Finding the correct key within a cipher may be hard, yet could be much, much easier than finding the exact cipher actually being used. Differences in the difficulty of finding different key bits are simply glossed over in the adapted general model.

    Somehow obtaining and breaking every cipher which possibly could have been used is a vastly larger problem than the relatively small increase in keyspace indicated by the number of possible ciphers. For example, if we think we have found the key and want to check it, on a known cipher that has essentially no cost and may take a microsecond. But if we want to check the key on an unknown cipher, we first have to obtain that cipher. That may require the massive ongoing cost of maintaining an intelligence field service to obtain copies of secret ciphers. Once the needed cipher is obtained, finding a practical break may take experts weeks or months, if a break is even found. Taken together, this is a vast increase in difficulty for the opponent per cipher choice compared to the difficulty per key choice within a single cipher.

    Just as it may be impossible to try every 128-bit key at even a nanosecond apiece, it also may be impossible to keep up with a far smaller but continuing flow of new secret ciphers which take hundreds of billions of times longer to handle. This advantage seems to be exploited by NSA in keeping cipher designs secret (also see security through obscurity). Given the stark contrast of yet another real example which contradicts the current cryptographic wisdom, crypto academics continue to insist that standardizing and exposing the cipher design makes sense. Surely, exposing a cipher does support gratuitous analysis and help to expose some cipher weakness, but does not, in the end, give us a proven strong cipher. In the end, exposing the cipher may turn out to benefit opponents far more than users.

    In practice, an individual attacker mainly must hope that the cipher to be broken is flawed. An attacker can collect ciphertext statistics and hope for some irregularity, some imbalance or statistical bias that will identify the cipher class, or maybe even a well-known design. An attacker can make plaintext assumptions and see if some key will produce those words. But enciphering guessed plaintext seems an unlikely path to success when every possible cipher, and every possible modification of that cipher, is the potential encryption source. All this is a very difficult problem, and far different than the normal academic analysis.

    Other Weaknesses

    Many academic attacks are essentially theoretical, involving huge amounts of data and computation. But even when a direct technical attack is practical, that may be the most difficult, expensive and time-consuming way to obtain the desired information. Other methods include making a paper copy, stealing a copy, bribery, coercion, and electromagnetic monitoring. No cipher can keep secret something which has been otherwise revealed. Information security thus involves far more than just cryptography, and a cryptographic system is more than just a cipher (see: cipher system). Even finding that information has been revealed does not mean that a cipher has been broken, although good security virtually requires that assumption. (Of course, when we can use only one cipher, we cannot change ciphers anyway.)

    Unfortunately, we have no way to know how strong a cipher appears to our opponents. Even though the entire reason for using cryptography is a belief that our cipher has sufficient strength, science provides no basis for such belief. At most, cryptanalysis can give us only an upper limit to the strength of a cipher, which is not particularly helpful, and can only do that when a cipher actually can be broken. But when a cipher is not broken, cryptanalysis has told us nothing about the strength of a cipher, and unbroken ciphers are the only ones we use.

    Cryptanalytic Contributions and Limits

    The ultimate goal of cryptanalysis is not to break every possible cipher (that would be the end of an industry and also the end of new PhD's in the field). Instead, the obvious goal is understanding why some ciphers are weak, and why other ciphers seem strong. It is not much of a leap from that to expect cryptanalysts to work with, or at least interact with, cipher designers, with a common goal of producing better ciphers.

    Unfortunately, cryptanalysis is ultimately limited by what can be done: there are no ciphering techniques which guarantee strength, and there is no test which tells us how weak an arbitrary cipher really is. Accordingly, exposing a particular weakness in a particular cipher may be about as much as cryptanalysis can offer, even if that means a deafening silence about similar designs, ciphers which have been repaired, or significant cipher designs which remain both unbroken and undiscussed.

    Cryptanalyst
    Someone who attacks ciphers with cryptanalysis. A "codebreaker." Often called the opponent by cryptographers, in recognition of the (serious) game of thrust and parry between these parties.

    Crypto Controversies
    In cryptography, some supposedly technical disputes just seem to go on and on. Some topics have led to many hundreds of postings on Usenet sci.crypt in various conversations across multiple years. Clearly, those are controversial topics, whether academics think so or not. But it is also controversial to point out alternatives which conflict with current cryptographic wisdom or techniques. Dissent and disagreement are how controversies start.

    Apparent agreement among academics does not imply a lack of academic controversy, since many will side with the conventional wisdom, while others step back to consider the arguments. Since reality is not subject to majority rule, even universal academic agreement would not constitute a scientific argument, which instead requires facts and exposed logical reasoning. Controversy may even imply that academic cryptographers are unaware of the issue, or have not really considered it in a deep way. For if clear, understandable and believable explanations already existed, there would be little room for debate. Controversy arises when the given explanations are false, or obscure, or unsatisfactory.

    Scientific controversy is less about conflict than exposing Truth. That happens by doing research, then taking a stand and supporting it with facts and scientific argument. Many of these issues should have indisputable answers or expose previously ignored consequences. Wishy-washy statements like "some people think this, some think that," not only fail to inform, but also fail to frame a discussion to expose the real answer.

    Science Means Models

    One aspect of science is the creation of quantitative models which describe or predict reality. Since poor models lead to errors in reasoning, a science reacts to poor predictions by improving the models. In contrast, cryptography reacts by making excuses about why the model really is right after all, or does not apply, or does not matter. Examples include:

    • A common model for a conventional block cipher is "a family of permutations." But that family is huge, and in practice almost none of that family can be selected. A better model would be "a tiny subset of a family of permutations," which would lead immediately to questions about subset size and structure. Absent a realistic model, such questions rarely arise, and are even more rarely addressed.
    • The one time pad (OTP) is often said to be "proven secure." But NSA has in fact broken OTP's in practice, as we know from their description of VENONA. In exactly what way can cryptography claim that "proven secure" has any practical meaning, when using a "proven secure" cipher can result in death by execution from cipher failure? If "proven secure" is intended to be only theoretical, where is the practical analysis we need?
    • Cryptanalysis is sometimes said to be how we know our ciphers are strong, but that is false. The best cryptanalysis can do is find a problem, but not finding a problem does not make a cipher "strong," only "not known to be weak." That is not mere semantics, but instead sets absolute limits on cryptanalysis as a source of knowledge about strength. It also implies that cryptography must do something different just to get the knowledge of strength that most people think we already have. By not exposing this limitation, cryptography encourages students, buyers and users to form the conceptual model that mathematics can deliver guaranteed strength in practice when in reality that is almost never possible. And it is similarly impossible to know the probability of failure.

    In my view, cryptography has presided over a fundamental breakdown in logic, perhaps created by awe of supposedly superior mathematical theory. Upon detailed examination, however, theoretical math often turns out to be inapplicable to the case at hand. Practical results which conflict with theory are ignored or dismissed, even though confronting reality is how science improves models. Demanding belief in conventional cryptographic wisdom requires people to think in ways which accept logical falsehood as truth, and then they apply that lesson. Reasoning errors have become widespread, accepted, and prototypes for future thought. Disputes in cryptography are commonly argued with logic fallacies, and may be "won" with arguments that have no force at all. Since experimental results are rare in cryptography, we cannot afford to lose reason, because that is almost all we have.

    One Cipher is Not Enough

    A major logical flaw in conventional cryptography is the belief that one good cipher is enough.

    But since cipher strength occurs only in the context of our opponents, how could we ever know that we have a "good" cipher, or how "good" it is?

    In particular:

    1. There is no way to measure strength for an arbitrary practical cipher.
    2. Theoretical strength "proofs" almost never apply in practice.

    So if we cannot measure "good," and cannot prove "good," then exactly how do we know our ciphers are "good?" The answer, of course, is that we do not and can not know any such thing. In fact, nobody on our side can know that our ciphers are "good," no matter how well educated, experienced or smart they may be, because that is determined by our opponents in secret. Anyone who feels otherwise should try to put together what they see as a scientific argument to prove their point.

    When conventional cryptography accepts a U.S. Government standard cipher as "good" enough, there is no real need:

    • to research and develop fundamentally new cipher designs;
    • to innovate ways to measure strength;
    • to certify multiple ciphers;
    • to allow users to choose their cipher;
    • to change ciphers from time to time; and
    • there is certainly no need for ever-so-costly multiple encryption.

    Conventional cryptography encourages a belief in known cipher strength, thus ignoring both logic and the lessons of the past. That places us all at risk of cipher failure, which probably would give no indication to the user and so could be happening right now. That we have no indication of any such thing is not particularly comforting, since that is exactly what our opponents would want to portray, even as they expose our information.

    When an ordinary person makes a claim, they can be honestly wrong. But when a trained expert in the field makes a claim that we know cannot be supported, and continues to make such claims, we are pretty well forced into seeing that as either professional incompetence or deliberate deceit. Encouraging people to use only one cipher by claiming they need nothing else is exactly what one would expect from an opponent who knows how to break the cipher. Maybe that is just coincidence.

    A List of Crypto Controversies

    Cryptographic controversies include:

    • AES: AES is the new conventional block cipher standard. However, like most modern block ciphers, it uses only a breathtakingly small portion of the possible keyspace. And while that may not be a known weakness, it is at least disturbing.
    • BB&S: Blum, Blum and Shub is the way we refer to a well known article which describes a very slow random number generator which is supposedly "provably secure." Current crypto texts generally present a simplified version, but deceptively use the same BB&S name. That simplified construction has a very rare weakness related to choosing the initial value, a weakness that can be fixed at modest cost. As a consequence, a designer who wants assurance that every known fixable weakness has been eliminated will use the original construction. Confrontation arises when someone insists that no competent designer could have such a goal.
    • Bijective Compression: When random blocks or strings decompress into apparently grammatical text, it may be difficult to automatically distinguish the correct decryption from incorrect decryptions. The result is a potential increase in practical strength.
    • Block Cipher Definitions: Ordinarily, the person wishing to display an insight gets to set up the definitions, but of course communication requires both the writer and reader to have the same definition. Various block cipher definitions exist.
    • CBC First Block Problem: CBC requires an IV, typically sent with ciphertext. That IV is exclusive-ORed with the plaintext of the first block. So if opponents know the first block plaintext, and can intercept and change the IV, and the IV is sent as plaintext, the opponents can change the deciphered first block to any desired value. This is a form of man-in-the-middle attack. The usual advice is to use a MAC to authenticate the full deciphered message, and that works. But the fundamental problem is confidentiality, not authentication. A MAC is unnecessary if the IV is enciphered.
    • Cryptanalysis: Cryptanalysis is sometimes thought of as the way we know the strength of a cipher, but that is wrong. At its best, cryptanalysis can find out that a cipher is weak, and then we do not use that cipher. But cryptanalysis which finds no break does not mean that the cipher is strong, and we will use it anyway.
    • Data Compression: Data compression can increase the unicity distance while simultaneously decreasing the size of plaintext which must exceed that distance to be attacked.
    • Distinguisher: A distinguisher typically is some sort of statistical test that will show an imperfect distribution in, for example, a conventional block cipher. But by itself, that does not show weakness, and is no form of a break at all.
    • Dynamic Transposition: A class of unconventional block cipher with a particularly clear strength argument.
    • Entropy: Shannon's entropy measures coding efficiency. It produces the same result whether a sequence is known in advance or not. It produces the same result whether a sequence is predictable or not. Entropy does not measure unpredictability.
    • Huge Block Cipher Advantages: If we have a technology which allows the efficient construction of huge blocks, such blocks can have significant advantages.
    • Kerckhoffs' Requirements: Everybody thinks they already know what Kerckhoffs wrote.
    • Known Plaintext: Modern ciphers are sometimes thought to be invulnerable to known plaintext. But if we model a cipher as a mathematical function taking plaintext to ciphertext, known plaintext is how an attacker examines the function that the cryptographer wishes to hide.
    • Multiple Encryption: The idea is to add redundancy and so avoid the single point of failure represented by a single cipher.
    • Old Wives' Tales: Various delusions are widely accepted in the strange field of cryptography.
    • One Time Pad: The one time pad is often held up as the only example of a proven unbreakable cipher. Unfortunately, only the theoretical version is proven unbreakable, and that version only protects theoretical data. In practice there is no proven secure one time pad, because the required assumptions cannot be provably achieved in practice. Unfortunately, the crypto texts cause every crypto newbie to come to a different and wrong conclusion. That is first a failure of the field to properly model reality, and next a willingness to deceive those with the least crypto training and experience.
    • Proof: It is currently fashionable for cryptographic constructions to be claimed (and acclaimed) "provably secure." But for a proof to apply, every assumption it uses must be both known, and known to be true. For a proof to be effective for a user, that user must be able to first know every required assumption, and next show that each has been achieved in practice. Since few if any cryptosystems actually allow that, for users, "provably secure" generally is just another crypto deception, especially for those with the least crypto training and experience.
    • Randomness Testing: Randomness cannot be certified by test, just indicated. If absolute randomness is required, no test can tell us when we have it.
    • Really Random: Predictable statistical randomness is easily created, but unpredictable real randomness cannot be generated by a deterministic machine. Although various forms of "entropy" are supposedly available in a computer system, upon closer examination we may find that little unpredictability remains.
    • Risk: Ciphers may be the only product in modern manufacture which cannot be tested to their design goal. If the goal of a cipher is to keep secrecy, that can only be judged by opponents who work against us in secret and do not announce their results. Since we cannot know when our ciphers fail, we cannot begin to know the risk of cipher failure. Designers cannot know the outcome of their attempts to meet ciphering goals. That means there can be no expertise in cipher strength, by anyone, no matter how well educated or experienced. Even opponents only know their part of the truth.
    • Scalability: There is, and probably can be, no test to measure the strength of an arbitrary cipher. All ciphers are completely outside manufacturing control with respect to strength. An alternative is to develop designs which can be implemented at tiny size, and then exhaustively tested. Tiny ciphers will not be strong, but they can be rigorously tested in ways that large ciphers cannot. Scalability is an enabling technology which provides insight to strength, something sorely needed but not supported by conventional cryptography.
    • Snake Oil: Since there are no tests of cipher strength, certain rules of thumb have been used to indicate weakness. Those rules have problems.
    • Software Patent: There are no software patents; there are just patents which apply to software. There are no pure algorithm patents, but patents always have described machines which implement algorithms. In particular, process claims apparently have been granted almost from the beginning of U.S. patents. (But see a patent lawyer in any real situation.)
    • Strength: Crypto people talk about strength precisely the way that virgins talk about sex: with great enthusiasm and little understanding. There is and can be no expertise on cipher strength.
    • Term of Art: Terms used by various professions often mean something different than the exact same phrase used on its own. That can lead to endless difficulty in understanding and discussion, especially for those with the least training and experience.
    • Threat Model: Trying to estimate cipher strength by predicting attack possibilities is usually hopeless.
    • Trust: The value of trust inherently depends upon having a relationship with consequences should trust fail.
    • Unpredictability: One aspect of science is the construction of numerical models which "correctly predict" measurable outcomes. Similarly, cryptanalysis seeks to build models which "correctly predict" the unknown cipher key. The inability to construct a key-predicting scientific model is one way to see cipher strength.

    In my view, cryptography often does not understand or attempt to address controversial issues in a scientific way. In areas where cryptography cannot distinguish between truth and falsehood, it cannot advance.

    If anyone has any other suggestions for this list, please let me know.

    Cryptographer
    Someone who creates ciphers using cryptography.

    Cryptographic Hash
    A complex hash function in which deliberately producing any particular result is thought to be very difficult. (Also see one way hash and MAC.)

    Cryptographic Mechanism
    A process for enciphering and/or deciphering, or an implementation (for example, hardware, computer software, hybrid, or the like) for performing that process. See also cryptography and mechanism.

    Cryptographic Random Number Generator
    (CRNG). A random number generator (RNG) suitable for cryptographic use.

    Uses might include:

    Requirements for such an RNG will vary depending upon use, but might include:

    Cryptography
    Greek for "hidden writing." The art and science of transforming (encrypting) information (plaintext) into an intermediate form (ciphertext) which secures information in storage or transit. Normally, security occurs as a result of having a vast number of different transformations, as selected by some sort of key. Then, if an opponent acquires some ciphertext, a vast number of different plaintext messages presumably could have produced that exact same ciphertext, one for each of the possible keys.

    We normally assume that the opponents have a substantial amount of known plaintext to use in their work (see cryptanalysis). So the situation for the opponents involves taking what is known and trying to extrapolate or predict what is not known. That is similar to building a scientific model intended to predict larger reality on the basis of many fewer experiments. Since the whole idea is to make prediction difficult for the opponent, unpredictability can be called the essence of cryptography.

    Cryptography is a part of cryptology, and is further divided into secret codes versus ciphers. As opposed to steganography, which seeks to hide the existence of a message, cryptography seeks to render a message unintelligible even when the message is completely exposed.

    Cryptography includes at least:

    Cryptography may also include:

    In practice, cryptography should be seen as a system or game which includes both users and opponents: True scientific measures of strength do not exist when a cipher has not been broken, so users can only hope for their cipher systems to protect their messages. But opponents may benefit greatly if users can be convinced to adopt ciphers which opponents can break. Opponents are thus strongly motivated to get users to believe in the strength of a few weak ciphers. Because of this, deception, misdirection, propaganda and conspiracy are inherent in the practice of cryptography. (Also see trust and risk analysis.)

    And then, of course, we have the natural response to these negative possibilities, including individual paranoia and cynicism. We see the consequences of not being able to test cipher security in the arrogance and aggression of some newbies. Even healthy users can become frustrated and fatalistic when they understand cryptographic reality. Cryptography contains a full sea of unhealthy psychological states.

    Great Expectations in Cryptography

    For some reason (such as the lack of direct academic statements on the issue), some networking people who use and depend upon cryptography every day seem to have a slightly skewed idea about what cryptography can do. While they seem willing to believe that ciphers might be broken, they assume such a thing could only happen at some great effort. Apparently they believe the situation has been somehow assured by academic testing. But that belief is false.

    Ciphers are like puzzles, and while some ways to solve the puzzle may be hard, other ways may be easy. Moreover, once an easy way is found, that can be put into a program and copied to every "script kiddie" around. The hope that every attacker would have to invest major effort to find their own fast break is just wishful thinking. And even as their messages are being exposed, the users probably will think everything is fine, just like we think right now. Cipher failure could be happening to us right now, because there will be no indication when failure occurs.

    What are the chances of cipher failure? We cannot know! Ciphers are in that way different from nearly every other constructed object. Normally, when we design and build something, we measure it to see that it works, and how well. But with ciphers, we cannot measure how well our ciphers resist the efforts of our opponents. Since we have no way to judge effectiveness, we also cannot judge risk. Thus, we simply have no way to compare whether the cipher design is more likely to be weak than the user, or the environment, or something else. As sad as this situation may seem, it is what we have.

    When compared to the alternative of blissful ignorance, it should be a great advantage to know that ciphers cannot be depended upon. First, design steps could be taken to improve things (although that would seem to require a widespread new understanding of the situation that has always existed). Next, we note that ciphers can at most reveal only what they try to protect: When protected information is not disturbing, or dangerous, or complete, or perhaps not even true, exposure becomes much less of an issue.

    Keying in Cryptography

    Modern cryptography generally depends upon translating a message into one of an astronomical number of different intermediate representations, or ciphertexts, as selected by a key. If all possible intermediate representations have similar appearance, it may be necessary to try all possible keys (a brute force attack) to find the key which deciphers the message. By creating mechanisms with an astronomical number of keys, we can make this approach impractical.

    Keying is the essence of modern cryptography. It is not possible to have a strong cipher without keys, because it is the uncertainty about the key which creates the "needle in a haystack" situation which is conventional strength. (A different approach to strength is to make every message equally possible, see: Ideal Secrecy.)

    Nor is it possible to choose a key and then reasonably expect to use that same key forever. In cryptanalysis, it is normal to talk about hundreds of years of computation and vast effort spent attacking a cipher, but similar effort may be applied to obtaining the key. Even one forgetful moment is sufficient to expose a key to such effort. And when there is only one key, exposing that key also exposes all the messages that key has protected in the past, and all messages it will protect in the future. Only the selection and use of a new key terminates insecurity due to key exposure. Only the frequent use of new keys makes it possible to expose a key and not also lose all the information ever protected.

    Engineering in Cryptography

    Cryptography is not an engineering science: It is not possible to know when cryptography is "working," nor how close to not-working it may be:

    • There is no test that will give us the strength of an arbitrary cipher.
    • There is no "materials science" that will tell us how much strength to expect, given the component parts.
    • Academic strength proofs almost never apply in practice.
    • We do not know who is attacking, nor what their resources might be, nor how much success they have had.
    • There will be no exposure of failure from which we might develop even a handwave probability of weakness or risk.
    • Since we can expect to not know of failure, no amount of use without seeing failure can develop trust.
    (Also see: scientific method and threat model.)

    Cryptography may also be seen as a zero-sum game, where a cryptographer competes against a cryptanalyst. We might call this the cryptography war.

    Cryptography War
    Cryptography may be seen as a dynamic battle between cryptographer and cryptanalyst or opponent. The cryptographer tries to produce a cipher which can retain secrecy. Then, when it becomes worthwhile, one or more cryptanalysts may try to penetrate that secrecy by attacking the cipher. Fortunately for the war, even after fifty years of mathematical cryptology, not one practical cipher has been accepted as proven secure in practice. (See, for example, the one-time pad.)

    Note that the successful cryptanalyst must keep good attacks secret, or the opposing cryptographer will just produce a stronger cipher. This means that the cryptographer is in the odd position of never knowing whether his or her best cipher designs are successful, or which side is winning.

    Cryptographers are often scientists who are trained to ignore unsubstantiated claims. But the field of cryptography often turns the scientific method on its head, because almost never is there a complete proof of cryptographic strength in practice. In cryptography, scientists accept the failure to break a cipher as an indication of strength (that is the ad ignorantium fallacy), and then demand substantiation for claims of weakness. But there will be no substantiation when a cipher system is attacked and broken for real, while continued use will endanger all messages so "protected." Evidently, the conventional scientific approach of requiring substantiation for claims is not particularly helpful for users of cryptography.

    Since the scientific approach does not provide the assurance of cryptographic strength that users want and need, alternative measures become appropriate:

    • It can be a reasonable policy to not adopt a widely-used cipher, since such ciphers provide the best target for attackers and also the best reward for cryptoanalytic success.
    • Another reasonable policy is to change ciphers periodically, perhaps even on a message-by-message basis. This limits the extent of use of any weak cipher.
    • Yet another reasonable policy is to use multiple encryption as a matter of course (see Algebra of Secrecy Systems). Using three different ciphers on every message protects the message even if one of the ciphers has been broken in secret.

    Cryptology
    The field of study covering all forms of message protection and exposure. Typically thought to include:
    • steganography (methods which seek to conceal the presence of a message, such as patterns in graphics and secret inks)
    • cryptography (methods which translate a message into an intermediate form intended to hide information even when completely exposed)
    • cryptanalysis (methods which expose information hidden by cryptography)

    Sometimes also said to include:

    • message interception
    • traffic analysis (the use of transmission data, such as time of origin, call signs and message length, as a basis for intelligence)

    Cryptosystem
    A term which might be used to describe the overall function of ciphering; see: cipher system.

    It is especially important to consider the effect the underlying equipment has on the design. Even apparently innocuous operating system functions, such as the multitasking "swap file," can capture supposedly secure information, and make that available for the asking. Since ordinary disk operations generally do not even attempt to overwrite data on disk, but instead simply make that storage free for use, supposedly deleted data is, again, free for the asking. A modern cryptosystem will at least try to address such issues.

    Crystal
    1. In Physics, atoms arranged in a repeating structure.
    2. In electronics, typically a small part of a thin slice of a much larger quartz crystal, intended to regulate or filter a particular frequency. Typically a thin slab of translucent mineral, perhaps a quarter of an inch square, usually with metal terminals plated on each large side. Frequency preference occurs because of mechanical resonance -- an actual flexing of the crystalline rock itself -- which "rings" at a particular frequency.

    Quartz is a piezoelectric material, so a voltage across the terminals forces the quartz wafer to bend slightly, thus storing mechanical energy in physical tension and compression of the solid quartz. The physical mass and elasticity of quartz cause the wafer to mechanically resonate at a natural frequency depending on the size and shape of the quartz blank. The crystal will thus "ring" when the electrical force is released. The ringing will create a small sine wave voltage across electrical contacts touching the crystal, a voltage which can be amplified and fed back into the crystal, to keep the ringing going as oscillation.

    Crystals are typically used to make exceptionally stable electronic oscillators (such as the clock oscillators widely used in digital electronics) and the relatively narrow frequency filters often used in radio.

    It is normally necessary to physically grind a crystal blank to the desired frequency. While this can be automated, the accuracy of the resulting frequency depends upon the effort spent in exact grinding, so "more accurate" is generally "more expensive."

    Frequency stability over temperature depends upon slicing the original crystal at precisely the right angle. Temperature-compensated crystal oscillators (TCXO's) improve temperature stability by using other components which vary with temperature to correct for crystal changes. More stability is available in oven-controlled crystal oscillators (OCXO's), which heat the crystal and so keep it at a precise temperature despite ambient temperature changes.

    Crystal Oscillator
    In electronics, an oscillator in which the frequency is regulated by a quartz crystal. Typically used to produce very regular clock signals for digital electronics. In digital use, the analog sine-wave signal of the internal oscillator is digitized, and the square-wave result becomes the clock. Also see: specification.

    Sometimes suggested in cryptography as the basis for a TRNG, typically based on phase noise or frequency variations. But a crystal oscillator is deliberately designed for high frequency stability; it is thus the worst possible type of oscillator from which to obtain and exploit frequency variations. And crystal oscillator phase noise (which we see as edge jitter) is typically tiny and must be detected on a cycle-by-cycle basis, because it does not accumulate. Detecting a variation of, say, a few picoseconds in each 100nSec period of a typical 10 MHz oscillator is not something we do on an ordinary computer.

    Another common approach to a crystal oscillator TRNG is to XOR many such oscillators, thus getting a complex high-speed waveform. (The resulting digital signal rate increases as the sum of all the oscillators.) Unfortunately, the high-speed and asynchronous nature of the wave means that setup and hold times cannot be guaranteed to latch that data for subsequent use. (Latching is inherent in, say, reading a value from a computer input port.) That leads to statistical bias and possible metastable operation. Futher, the construction is essentially linear and may power up similarly each time it is turned on.

    Current
    The measure of electron flow, in amperes. A current of one amp is one coulomb per second, or about 6.24 x 1018 electrons per second. Conventional current flow is in the direction opposite to the flow of electrons, because electrons have been assigned a negative charge. The movement of electrons is a negative flow, which is generally equivalent to a positive flow in the opposite direction.

    Current is analogous to the amount of water flow, as opposed to pressure or voltage. A flowing electrical current will create a magnetic field around the conductor. A changing electrical current may create an electromagnetic field.

    Cycle
    1. Something which repeats over and over, like a wheel turning.
    2. One full period of a repetitive signal.
    3. In a FSM, a path which repeats endlessly. As opposed to an arc. In any FSM of finite size, every state sequence must eventually repeat or lead into a sub-sequence which repeats; every state eventually leads to a cycle.

    In some RNG constructions, (e.g., BB&S and the Additive RNG) the system consists of multiple independent cycles, possibly of differing lengths. Since having a cycle of a guaranteed length is one of the main requirements for an RNG, the possibility that a short cycle may exist and be selected for use can be disturbing.

    Cyclic Group
    A simple group in which every element is produced by one of the elements, and that element is called a generator.

    Cypher
    In the U.S., an overly-cute misspelling of cipher.

    Data
    1. Values, especially when coded for electronic representation.
    2. Facts, before being processed into "information." (The "information" then may be interpreted as "knowledge.")

    Data Compression
    The ability to represent data in forms which take less storage than the original. The limit to this is the amount of uniqueness in the data.

    Sometimes people claim that they have a method to compress a file, and that they can compress it again and again, until it is only a byte long. Unfortunately, it is impossible to compress all possible files down to a single byte each, because a byte can only select 256 different results. And while each byte value might represent a whole file of data, only 256 such files could be selected or indicated.

    Normally, compression is measured as the percentage size reduction; 60 percent is a good compression for ordinary text.

    In general, compression occurs by representing the most-common data values or sequences as short code values, leaving longer code values for less-common sequences. Understanding which values or sequences are more common is the "model" of the source data. When the model is wrong, and supposedly less-common values actually occur more often, that same compression may actually expand the data.

    Data compression is either "lossy," in which some of the information is lost, or "lossless" in which all of the original information can be completely recovered. Lossy data compression can achieve far greater compression, and is often satisfactory for audio or video information (which are both large and may not need exact reproduction). Lossless data compression must be used for binary data such as computer programs, and probably is required for most cryptographic uses.

    Compression and Encryption

    Compressing plaintext data has the advantage of reducing the size of the plaintext, and, thus, the ciphertext as well. Further, data compression tends to remove known characteristics from the plaintext, leaving a compressed result which is more random. Data compression can simultaneously expand the unicity distance and reduce the amount of ciphertext available which must exceed that distance to support attack. Unfortunately, that advantage may be most useful with fairly short messages. Also see: Ideal Secrecy.

    One goal of cryptographic data compression would seem to be minimize the statistical structure of the plaintext. Since such structure is a major part of cryptanalysis, that would seem to be a major advantage. However, we also assume that our opponents are familiar with our cryptosystem, and they can use the same decompression we use. So the opponents get to see the structure of the original plaintext simply by decompressing any trial decryption they have. And if the decompressor cannot handle every possible input value, it could actually assist the opponent by identifying wrong decryptions.

    When using data compression with encryption, one pitfall is that many compression schemes add recognizable data to the compressed result. Then, when that compressed result is encrypted, the "recognizable data" represents known plaintext, even when only the ciphertext is available. Having some guaranteed known plaintext for every message could be a very significant advantage for opponents, and unwise cryptosystem design.

    It is normally impossible to compress random-like ciphertext. However, some cipher designs do produce ciphertext with a restricted alphabet which can of course be compressed. Also see entropy.

    Bijective Compression

    Another possibility is to have a data decompressor that can take any random value to some sort of grammatical source text. That may be what is sometimes referred to as bijective compression. Typically, a random value would decompress into a sort of nonsensical "word salad" source text. However, the statistics of the resulting "word salad" could be very similar to the statistics of a correct message. That could make it difficult to computationally distinguish between the "word salad" and the correct message. If "bijective compression" imposes an attack requirement for human intervention to select the correct choice, that might complicate attacks by many orders of magnitude. The problem, of course, is the need to devise a compression scheme that decompresses random values into something grammatically similar to the expected plaintext. That typically requires a very extensive statistical model, and of course at best only applies to a particular class of plaintext message.

    An extension of the "bijective" approach would be to add random data to compressed text. Obviously, there would have to be some way to delimit or otherwise distinguish the plaintext from the added data, but that may be part of the compression scheme anyway. More importantly, the random data probably would have to be added between the compressed text in some sort of keyed way, so that it could not easily be identified and extracted. The keying requirement would make this a form of encryption. The result would be a homophonic encryption, in that the original plaintext would have many different compressed representations, as selected by the added random data. Having many different but equivalent representations allows the same message to be sent multiple times, each time producing a different encrypted result. But it is also potentially dangerous, in that the compressed message expands by the amount of the random data, which then may represent a hidden channel. Since, for encryption purposes, any random data value is as good as another, that data could convey information about the key and the user would never know. Of course, the same risk occurs in message keys or, indeed, almost any nonce.

    Data Fabrication
    The creation of false data. Intentional data fabrication for personal gain is a form of fraud. An issue of scientific integrity.

    Data Falsification
    The alteration of data, including selective omission. Intentional data falsification for personal gain is a form of fraud. An issue of scientific integrity.

    Data Security
    The extent to which data are secure. Protection of software and data from unauthorized modification, destruction, or disclosure. May or may not include cryptography.

    dB
    decibel.

    DC
    1. In cryptography, Differential Cryptanalysis.
    2. In electronics, direct current: Electrical power which flows in one direction, more or less constantly. As opposed to AC, which frequently reverses direction.

    Most electronic devices require DC -- at least internally -- for proper operation, so a substantial part of modern design is the "power supply" which converts 120 VAC wall power into 12 VDC, 5 VDC and/or 3 VDC as needed by the circuit and active devices.

    Debug
    The interactive analytical process of correcting bugs in the design of a complex system. A normal part of the development process.

    Contrary to naive expectations, a complex system almost never performs as desired when first realized. Both hardware and software system design environments generally deal with systems which are not working. When a system really works, the design and development process is generally over.

    Debugging involves identifying problems, analyzing the source of those problems, then changing the construction to fix the problem. (Hopefully, the fix will not itself create new problems.) This form of interactive analysis can be especially difficult because the realized design may not actually be what is described in the schematics, flow-charts, or other working documents: To some extent the real system is unknown.

    The most important part of debugging is to understand in great detail exactly what the system is supposed to do. In hardware debugging, it is common to repeatedly reset the system, and start a known sequence of events which causes a failure. Then, if one really does know the system, one can probe at various points and times and eventually track down the earliest point where the implemented system diverges from the design intent. The thing that causes the divergence is the bug. Actually doing this generally is harder than it sounds.

    Software debugging is greatly aided by a design and implementation process that decomposes complex tasks into small, testable procedures or modules, and then actually testing those procedures. Of course, sometimes the larger system fails anyway, in which case the procedure tests were insufficient, but they can be changed and the fixed procedure re-tested. Sometimes the hardest part of the debugging is to find some s