This is a "proof of concept" design for a cryptographic audio noise source. The prototype is completely powered from my computer sound system microphone input. As far as I know, this is the first noise source to operate on microphone power. However, the design needs much more testing, and may need significant changes.
Computers vary widely in the microphone power they supply, and in microphone gain.
Because of the wide variation of microphone supply values
between computers, please let me know your results if you build
and use this design.
The ultimate goal here is the production of cryptographically unpredictable values for computer use. Despite protests to the contrary, producing unpredictable numbers from completely predictable computer operations is almost impossible.
The alternative is to exploit some sort of unpredictable external process. Various possibilities include:
Digital Sound: Modern computer sound systems can take an analog electrical noise signal, sample the instantaneous voltage level, and produce a 16-bit value representing that voltage. That is analog-to-digital (A-D) conversion. Digitized noise consists of values which can be considered unpredictable in ways that computer programs simply cannot achieve. However, noise values can be more predictable than one might at first expect.
Sound Files: Digitizing a sound signal every 23 microseconds or so can produce the data for a WAV file which we can save and later play as sound. The recording can be analyzed with the full array of tools developed for audio analysis. The recorded digital values in that same WAV file can be used to compute statistics and run tests. Other WAV files may be created to test the statistical computations or produce desired sounds.
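As a minimal sketch of getting at those recorded values, the following reads the samples from a 16-bit PCM WAV file using only the Python standard library. The function names and the restriction to 16-bit samples are assumptions made for illustration; they are not part of the original design. A small writer is included because it is handy for creating test files.

```python
import struct
import wave


def read_wav_samples(path):
    """Return (sample_rate, samples) for a 16-bit PCM WAV file.

    For a multi-channel file only the first channel is returned.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit samples")
        channels = wav.getnchannels()
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    # 16-bit PCM is little-endian signed; frames interleave the channels.
    values = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return rate, list(values[::channels])


def write_wav_samples(path, rate, samples):
    """Write a mono 16-bit PCM WAV file, e.g. to build a known test signal."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))
```

At a 44.1kHz sample rate the interval between samples is 1/44100 second, or about 22.7 microseconds, which matches the "23 microseconds or so" figure.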
Noise Quality: Shot noise theory predicts the resulting noise will be white, with power distributed according to bandwidth. If we simply assume that, we can predict that noise will have various characteristics. But real noise sources often have surprising differences, sometimes reaching the level of serious faults. (See my article: Experimental Characterization of Recorded Noise.)
A common option is to ignore the whole idea of noise flaws and not bother to test for them. However, the concept of "defense in depth" only works if each layer can be relied upon to play its part. While accepting some flaws may be necessary to satisfy other objectives, it is only reasonable to want the noise to be as good as possible. We check the quality of the noise by testing. To explore problems in analog noise, we must test analog noise, in this case, the recorded sample values.
It is easy to hide flaws by processing the noise signal. But just because flaws do not show up in fairly general tests does not mean that exploitable flaws are not present. It is important to test noise generators in ways that make flaws most obvious, so that even small flaws can be observed and, hopefully, addressed.
Many noise problems can be exposed by the value frequency distribution. To get that, we simply count the number of times each digitized value occurs. White noise theory predicts that an accumulated noise histogram will take on the shape of a statistical normal distribution. When that does not happen, we need to fix the generator, get another, or find a way to live with low noise quality.
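The counting step, and one possible comparison against a normal curve, might be sketched as follows. The chi-square comparison, the function names, and the minimum expected count of 5 per bin are assumptions chosen for illustration; other goodness-of-fit measures would serve as well.

```python
import math
from collections import Counter


def value_histogram(samples):
    """Count how many times each digitized value occurs."""
    return Counter(samples)


def normal_fit_chi_square(samples, min_expected=5.0):
    """Compare the value histogram with a normal curve having the same
    mean and standard deviation. Returns (chi_square, bins_used)."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / n)
    if std == 0:
        raise ValueError("constant signal: no distribution to compare")
    counts = value_histogram(samples)

    def cdf(x):
        return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

    chi_square = 0.0
    bins_used = 0
    for value in range(min(counts), max(counts) + 1):
        # An integer sample value v stands for the interval [v-0.5, v+0.5).
        expected = n * (cdf(value + 0.5) - cdf(value - 0.5))
        if expected >= min_expected:  # skip sparse tails (assumed cutoff)
            chi_square += (counts.get(value, 0) - expected) ** 2 / expected
            bins_used += 1
    return chi_square, bins_used
```

For well-behaved noise the chi-square value should be comparable to the number of bins used. A clipped or otherwise distorted signal typically gives a far larger value, because clipping piles counts onto the extreme values.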
Noise Theory: The closer we measure real noise generators, the more obvious it becomes that noise theory does not completely describe real noise. Many noise sources are more than just slightly deviant, being instead very seriously flawed. And if a noise source does not behave as theory predicts, we cannot depend upon other predictions from that same theory. We can choose to hide this awkward fact behind confusion or post-processing, but hiding the problem will not make it go away, and may not prevent it from being exploited.
We want sample noise values to be independent. But the theory says that energy from all frequencies will be present. Low-frequency energy naturally produces a similar effect across many samples, which is a form of correlation instead of independence. We can reduce this somewhat by rolling off low frequencies. We can reduce it further by subtracting the previous sample from the current one. By using only the differences between samples, the independence of each sample is retained for use.
Autocorrelation: A common problem is that noise generators often produce a clear autocorrelation structure. (See the Autocorrelation Graph in "Characterization of ZENER1.WAV".) Since autocorrelation measures the ability of one value to predict another, autocorrelation is one of the most disturbing measures of cryptographic noise. So we need to test for autocorrelation, and if we find it, fix the noise source, get another, or find a way to handle it.
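A direct way to check recorded samples for this is to compute the normalized autocorrelation at a few small lags. This is a minimal sketch with an assumed function name; it is the textbook estimate, not necessarily the computation behind the referenced graph.

```python
def autocorrelation(samples, max_lag):
    """Return normalized autocorrelation coefficients for lags 0..max_lag.

    Lag 0 is always 1.0. For independent samples the other lags should be
    near zero, within roughly 1/sqrt(len(samples)).
    """
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]
    energy = sum(c * c for c in centered)
    if energy == 0:
        raise ValueError("constant signal has no defined autocorrelation")
    result = []
    for lag in range(max_lag + 1):
        total = sum(centered[i] * centered[i + lag] for i in range(n - lag))
        result.append(total / energy)
    return result
```

Any low-pass effect in the signal path shows up here as positive coefficients at small lags, since each sample then carries part of its neighbors.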
Post-Processing: Good digitized noise will have a statistical Gaussian or normal distribution, instead of the flat distribution desired for random numbers. Typically the distribution is flattened in post-processing, by overloading a hash computation. By further overloading the hash computation, we assure it cannot be reversed, even though the hash is not "cryptographic." Since cryptographic properties are not needed (and since the effectiveness of cryptographic hashes is only assumed, not proven), the small, fast and deeply understood CRC is a good choice. (See my 1986 Dr. Dobb's article: The Great CRC Mystery.)
Assuming we have high quality noise in a normal statistical distribution, at least 2.4 times the amount of CRC state is processed, then the CRC result is used. With a 32-bit CRC, 80 bits or 10 bytes of noise data would be used to produce each 32 bits or 4 bytes of flattened and unpredictable result. Lower quality noise will need more input data for the same amount of output. It is important that the CRC not be re-initialized.
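The bookkeeping can be sketched as follows. This uses the standard CRC-32 from Python's zlib module as a stand-in, which is an assumption: the text does not specify a particular CRC polynomial or implementation. Ten bytes (80 bits) go in for each 32-bit result, which satisfies the 2.4 x 32 = 76.8 bit minimum, and the CRC state is carried forward rather than re-initialized.

```python
import zlib


def flatten_with_crc32(noise_bytes, bytes_per_output=10):
    """Yield one 32-bit result for each `bytes_per_output` bytes of noise.

    The running CRC value is passed back into each call, so the state is
    never re-initialized between results. Trailing bytes that do not fill
    a complete block are ignored.
    """
    crc = 0
    usable = len(noise_bytes) - len(noise_bytes) % bytes_per_output
    for start in range(0, usable, bytes_per_output):
        block = noise_bytes[start:start + bytes_per_output]
        crc = zlib.crc32(block, crc)  # continue from the previous state
        yield crc
```

With lower quality noise, `bytes_per_output` would simply be raised so that more input is consumed for each result.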
Post-processing occurs after sound digitization, which occurs in the computer. Thus, post-processing is done in software. The noise generator described here is the hardware source for the noise sound signal. It does not, can not, and should not do the post-processing.
It is important to be able to measure the original noise
signal, because post-processing hides noise faults.
To identify noise weakness, we must seek where the problem is
clearest.
Testing post-processed results cannot certify the physical
unpredictability we need.
For example, even though common statistical RNG's are predictable
and weak, they are specifically designed to do well on statistical
tests.
If we use those same tests on post-processed noise generator output,
the best they can do is confirm that the noise generator acts like
a weak statistical RNG.
And if that was good enough, we would not need a noise source.
Some particularly ominous noise generator designs may not be able
to demonstrate that random noise has any effect at all, which
means if the noise source fails, we will not know.
Shot Noise Sources: In the earlier article "Junction Noise Measurements I," I measured a wide range of semiconductor junction noise sources. In general, these were semiconductor junctions functioning under current flow, and we might expect the result to be shot noise. But the results are not what we would expect from shot noise. Some observations include:
These results highlight the problem of relying on the shot noise model of noise production. Typically, most semiconductor junctions exhibit a high noise or excess noise region not predicted by the model. The louder excess noise region conveniently needs less amplification, and so is often used. However, excess noise does not fit the shot noise model.
Shot noise is a statistical effect of event "clumping" when the events have an expected rate but independent times. DC current flows through a PN junction as a myriad of tiny electron pulses, each with independent launch timing. The independence of timing produces a variation which we see as a tiny AC noise signal riding on the DC current.
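The size of that AC signal follows from the standard shot noise (Schottky) formula, RMS current = sqrt(2 q I B), where q is the electron charge, I the DC current and B the measurement bandwidth. A small sketch, in which the function name and the 20kHz audio bandwidth are assumptions for illustration:

```python
import math

ELECTRON_CHARGE = 1.602176634e-19  # coulombs


def shot_noise_current_rms(dc_current_amps, bandwidth_hz):
    """RMS shot noise current from the Schottky formula sqrt(2 q I B)."""
    return math.sqrt(2.0 * ELECTRON_CHARGE * dc_current_amps * bandwidth_hz)


# Example: 50uA of DC current measured over an assumed 20kHz bandwidth.
example_amps = shot_noise_current_rms(50e-6, 20e3)
print("shot noise: %.3g A RMS" % example_amps)
```

At 50uA the result is on the order of half a nanoamp RMS, which shows why raw shot noise needs so much amplification before a sound card can digitize it.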
The independent launch timing that makes shot noise also is the basis for our belief in shot noise unpredictability. Unexpectedly high or excess noise is not explained as shot noise, and does not support belief in unpredictability. In my view, the excess noise region probably should not be used in cryptography, even though such use is very common.
Noise Source Selection: If we have 6 volts or more of power at reasonable current, a whole range of noise production options becomes available. One of the most attractive is the use of low-voltage metal-oxide varistors (MOV's). Unfortunately, at micropower voltage and current levels, options are few.
One possibility sometimes mentioned is to simply use the background noise of the audio chain without any generator at all. But audio systems are deliberately constructed so that signals of reasonable level will override and mask tiny spurious signals (spurs) if any are present. AC hum, switching power-supply noise, transient runs of digital pulses, or even audio tones from nearby digital circuitry may well exist at extremely low audio levels, because they normally would be masked by any reasonable audio signal. But when there is no signal, there is no masking, and the background noise may compete with spurs that the audio designers did not see, or did not think important enough to kill.
Since different computers generally have different audio systems, the possibility of low-level spurs would require tests of each system. Every system would have to look for spurs and decide if the result was usable. Unfortunately, transient spurs produced by repeating program code may be all but undetectable to normal tests and in fact may not even be present during testing. Even FFT techniques may be insufficient to expose a periodicity in transient digital tones, even when periodicity exists. All this makes background noise a less desirable source, even if, after processing, it passes the same randomness tests as weak statistical RNG's.
Another possibility is to amplify raw shot noise. We could do that in a number of ways, but we will need about 100dB of total amplification. With micropower constraints, 100dB of audio amplification will be tough to achieve. And putting 100dB of audio amplification on a small board is asking to confront instability and oscillation issues. A better alternative may be to use a low-voltage band-gap reference, which, because of internal amplification, is notoriously noisy.
ICL8069: The ICL8069 is a 1.2 volt reference which functions like a precision zener diode. Since it is not intended to produce noise, there is a chance that future production or even different devices from the same production may be less noisy. However, the devices I have checked produce about 50uV RMS of analog noise using only 50uA of current at something over 1.2V. That is something like 40dB more than shot noise, and thus avoids 40dB of additional amplification. So we need to provide "only" about 60dB of reasonable micropower noise amplification.
Although the ICL8069 data sheet does not reveal the internal circuit, the magic zero temperature coefficient voltage of 1.23V indicates some form of bandgap reference. Bandgap references are notorious for high noise (as compared to, for example, a buried zener). A bandgap reference voltage is built from a transistor Vbe and a current mirror, run at a constant current. Obviously we cannot change the internal reference current and thus show shot noise variation. But noise measurements on the device do not show the peculiar and disturbing excess noise regions common in avalanching PN junctions.
Amplification: Three ordinary NPN transistors ought to be able to deliver 60dB of gain, provided they can be convinced to work with 1.5V or so and at under 30uA each. It turns out they can. A very simple feedback bias system for each stage keeps that transistor partly on, with both DC and AC feedback that improves linearity. An emitter resistor further improves linearity (and reduces gain) for that stage. Gain could be increased by using high-gain transistors and/or reducing feedback. Using another gain stage probably would exceed the power budget which allows microphone powering. Lower emitter resistor values do increase gain somewhat. Higher supply voltage seems to give proportionally higher output.
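The gain budget is simple decibel arithmetic, worked here as a sketch. The 60dB total, the three stages and the 50uV RMS source level come from the text above; splitting the gain equally across stages is an assumption, and the real stages need not be equal.

```python
import math


def db_to_voltage_ratio(db):
    """Convert a voltage gain in dB to a plain ratio."""
    return 10.0 ** (db / 20.0)


def voltage_ratio_to_db(ratio):
    """Convert a plain voltage ratio to dB."""
    return 20.0 * math.log10(ratio)


total_gain_db = 60.0
stages = 3
per_stage_db = total_gain_db / stages   # assumed equal split
source_rms = 50e-6                      # volts RMS from the reference
output_rms = source_rms * db_to_voltage_ratio(total_gain_db)

print("per stage: %.0f dB (x%.0f)" % (per_stage_db, db_to_voltage_ratio(per_stage_db)))
print("output: %.3f V RMS" % output_rms)
```

So 60dB is a voltage gain of 1000, about 10 per stage, and it lifts 50uV RMS to roughly 50mV RMS.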
These simple amplifier stages would not be ideal for general audio. Normal audio signals often are composed of multiple frequencies with vastly different amplitudes. Large signals may cover a large part of the transistor curve, thus distorting. They also drag along tiny signals and distort them similarly. Even more seriously, signals that should be completely separate and independent may interact, as a cross-modulation.
In contrast, very small signals, like noise, can be amplified with low distortion because the signal reliably covers only a small part of the transistor curve. And even though a noise signal can be seen as many frequencies occurring simultaneously, they should have relatively similar amplitudes. Because of this, even very simple audio amplification can be satisfactory for noise. By testing the sampled analog noise, we can see serious distortion when that is a problem.
Power Filtering: The incoming power is filtered repeatedly through series resistors and shunt capacitors. This is a multi-stage low-pass filter. Each filter stage reduces noise. At each stage, even a relatively large power supply hum is reduced to well below the signal level where that power is used. Higher frequency tones and short pulses on the power line are attenuated even more. Active voltage regulation would be extremely tough at these power levels.
Noise Filtering: Autocorrelation is a significant problem in noise value generation. Low-audio-frequency components naturally affect multiple adjacent samples when we want each sample to be independent. To reduce this effect, noise signals below 1kHz are rolled off. This occurs in a multistage high-pass filter which mainly involves correctly sizing capacitors which are needed anyway. Given equal level noise signals of 2kHz and 60Hz, the filter will cut the 60Hz to 60dB below the 2kHz reference. Signal filtering also helps with power supply and wiring hum pickup issues.
Filtering out noise below 1kHz is common in noise measurement since it removes most of the 1/f noise. The 1kHz bandwidth loss represents only about 5 percent of the total noise power.
Signal filtering occurs in three high-pass stages which use the value of each interstage or blocking capacitor to set the low-frequency roll-off. Capacitors of 0.01uF start the roll-off at about 1kHz. Users who want full-range noise may find that values around 0.5uF work better for them. That will not significantly increase the noise level, however.
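A rough feel for these numbers comes from the ideal first-order RC high-pass, with corner frequency 1/(2 pi R C). The sketch below treats the three stages as identical and non-interacting, and uses a hypothetical 15.9k effective resistance chosen only because it puts the corner near 1kHz with 0.01uF. Neither assumption comes from the schematic, where the stages load one another.

```python
import math


def rc_corner_hz(resistance_ohms, capacitance_farads):
    """Corner (-3dB) frequency of a first-order RC section."""
    return 1.0 / (2.0 * math.pi * resistance_ohms * capacitance_farads)


def highpass_gain_db(freq_hz, corner_hz, stages=1):
    """Gain in dB of `stages` identical, isolated first-order high-pass sections."""
    ratio = freq_hz / corner_hz
    gain = ratio / math.sqrt(1.0 + ratio * ratio)
    return stages * 20.0 * math.log10(gain)


R_EFFECTIVE = 15.9e3   # ohms, HYPOTHETICAL value for illustration only

corner = rc_corner_hz(R_EFFECTIVE, 0.01e-6)
rejection = (highpass_gain_db(2000.0, 1000.0, stages=3)
             - highpass_gain_db(60.0, 1000.0, stages=3))
print("corner with 0.01uF: %.0f Hz" % corner)
print("60Hz relative to 2kHz: %.1f dB down" % rejection)
```

This idealized model puts 60Hz roughly 70dB below 2kHz, the same general range as the figure quoted above. Exact agreement should not be expected from so simple a model. With the same assumed resistance, 0.5uF moves the corner down to about 20Hz.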
Capacitor Types: In all cases, the capacitors should be rated higher than the input voltage. In the prototype, I used 16V units for power filtering, and the interstage caps turned out to be 50V rated. Ceramic capacitors are not recommended for this application.
The interstage blocking capacitors should be film types. When we depend upon capacitor values for a specific filtering effect, we need the accurate and stable capacitance that film capacitors can provide.
The power filter capacitors should be tantalum.
That makes the power filter increasingly effective as
interference frequencies increase from 60Hz through RF.
It is extremely important that tantalum capacitors be installed
with correct polarity.
Here, the negative side of each tantalum is connected to the
negative ground copper surface.
Interconnection: Connectors U1, U2 and U3 represent wires to the computer. Typically this is a stereo mini-plug with cable attached. When inserted in the computer microphone jack, the shield will be ground (U2), one wire will carry a small voltage (U1), and the other (U3) is the signal output from the board and input to the computer audio chain.
Power Protection: Resistor R5 is an external 1/8W unit intended to protect the power source against a board short. This would be important if battery power were used. R5 also represents the first stage of power filtering.
Reverse Voltage Protection: Transistor Q4 is a P-channel
MOSFET connected in reverse.
When the voltage at U1 is positive with respect to ground U2,
current would flow in the Drain, through the internal body diode
D4, then out the Source, even if the gate was off.
But with positive voltage on U1, the gate will be on, and there
will be almost no voltage drop at all across the transistor.
But when the voltage at U1 is negative, the body diode is
reverse biased and the gate off, so no current flows.
That protects against reversed power. (See "Power Polarity Protection" in
A Modern Breadboarding Technology.)
The simulation program does not like the reversed P-MOSFET.
Simulating without the MOSFET produces results which correspond
to reality surprisingly well.
Power Filtering: Power filtering occurs in four resistor-capacitor stages: R5-C2, R13-C5, R14-C6, and R15-C7. The resistor values increase as less current flows through that stage. Capacitors could be larger, but should be tantalum. The 22uF value was the largest value I had at 16 volts, which thus supports an external 9V battery for flexibility.
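To get a feel for how much a chain like this attenuates supply interference, each stage can be approximated as an ideal first-order RC low-pass and the attenuations added in dB. The resistor values below are HYPOTHETICAL placeholders, since the actual values are not given in the text; only the 22uF capacitance is taken from it. The sketch also treats the stages as a simple chain and ignores loading between them and by the circuit, so it is an estimate only.

```python
import math


def rc_lowpass_gain_db(freq_hz, resistance_ohms, capacitance_farads):
    """Gain in dB of one ideal, unloaded RC low-pass section."""
    x = 2.0 * math.pi * freq_hz * resistance_ohms * capacitance_farads
    return -10.0 * math.log10(1.0 + x * x)


def cascade_gain_db(freq_hz, stages):
    """Sum the dB gains of isolated sections given as (ohms, farads) pairs."""
    return sum(rc_lowpass_gain_db(freq_hz, r, c) for r, c in stages)


# HYPOTHETICAL resistor values, rising from stage to stage; 22uF per the text.
STAGES = [(1000.0, 22e-6), (2200.0, 22e-6), (4700.0, 22e-6), (10000.0, 22e-6)]

for freq in (60.0, 1000.0, 20000.0):
    print("%7.0f Hz: %.0f dB" % (freq, cascade_gain_db(freq, STAGES)))
```

Whatever the actual values, the attenuation grows with frequency, which is the behavior wanted for hum, switching noise and short pulses on the supply line.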
Noise Generation: We assume that voltage reference V1, the ICL8069, produces about 60uV RMS noise. The V3 value of 1.2V is part of the device simulation. The device probably has about 40dB of internal amplification, and needs a minimum of about 50uA DC for operation.
Amplification: NPN transistor Q1 is the first amplifier stage. Pretty much any NPN should be usable. Resistor R2 is the collector load. Resistor R1 provides a simple negative feedback bias to keep the transistor partly on, so that it can respond to either positive or negative signals. This simple feedback biasing scheme is not particularly stable, but should be sufficient in this application. Emitter resistor R10 provides a modest amount of negative feedback both for improved bias stability and reduced distortion. Capacitor C1 couples the noise into the transistor base. The combination of C1, R1 and Q1 represents a high-pass filter stage which rolls off below about 1kHz.
Amplifier stages Q2 and Q3 are mostly the same thing. The emitter resistors are larger for increased negative feedback. That should help reduce the greater distortion expected when amplifying larger signals. We could get a few more dB of amplification by reducing those values or eliminating them entirely. Probably a better way to increase gain would be to use higher gain transistors. The transistors probably cost about a dime apiece; we could afford to spend a few bucks to get enough to select for higher gain. Or if the gain is too high, we could reduce total gain by up to 30dB by increasing all three emitter resistor values to as much as 5k.
Output Blocking: Output blocking capacitor C8 passes the noise signal while protecting the computer input from the DC voltage on the noise board (which of course came from the computer in the first place). The output capacitor needs to be bipolar (non-polar); I used a film.
Simulated Response: Audio work typically uses a log
frequency response which gives a graph like this:
For general audio such a response may be disturbing, but for
noise work, the noise power is more related to linear frequency
like this:
Note that almost all the noise power remains, with hum and
1/f problems removed.
Construction can be fairly casual.
I use the breadboard technology I described in the article
A Modern Breadboarding Technology.
The actual board size is about 3 inches by 1 and 3/8 inches.
The transistors are standing straight up, and so are tough to see
in this straight down view.
The resistor values are given in 4 digits, of which the last is
a decade multiplier.
The P-MOSFET device in the SOIC package at the upper right happens
to be an IRF7416, but a very wide range of modern devices should
work just as well.
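The 4-digit resistor marking mentioned above reads as three significant digits followed by a power-of-ten multiplier. A small decoder, as a sketch; the function name and the example codes are hypothetical and are not read from the board photo.

```python
def decode_resistor_code(code):
    """Decode a 4-digit marking: three significant digits, then a decade
    multiplier. Returns the resistance in ohms."""
    if len(code) != 4 or not all(ch in "0123456789" for ch in code):
        raise ValueError("expected exactly four digits, got %r" % (code,))
    return int(code[:3]) * 10 ** int(code[3])


# Hypothetical example markings:
for marking in ("1002", "4701", "2203"):
    print(marking, "=", decode_resistor_code(marking), "ohms")
```

So a part marked 1002 is 100 x 10^2 = 10,000 ohms, or 10k.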
Here is a noise FFT.
This is a 20-scan average of a 2048 point FFT of the noise signal after digitization. The x-axis or frequency ranges from 0 to 24kHz, so each division is about 2.4kHz. The y-axis or amplitude has increments of 5dB.
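An averaged spectrum of this kind can be reproduced from recorded samples along the following lines. This is a sketch under stated assumptions, not the analysis tool actually used: the Hann window, the non-overlapping blocks and the power averaging are all choices made here, and the 48kHz sample rate is inferred from the 24kHz top of the frequency axis.

```python
import cmath
import math


def fft(values):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return [complex(values[0])]
    if n == 0 or n % 2:
        raise ValueError("length must be a power of two")
    even = fft(values[0::2])
    odd = fft(values[1::2])
    half = n // 2
    out = [0j] * n
    for k in range(half):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + half] = even[k] - twiddle
    return out


def averaged_spectrum_db(samples, fft_size=2048, scans=20):
    """Average the power spectra of `scans` consecutive Hann-windowed blocks.

    Returns fft_size // 2 + 1 values in dB (relative, not calibrated).
    """
    if len(samples) < fft_size * scans:
        raise ValueError("not enough samples for the requested scans")
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * i / fft_size)
              for i in range(fft_size)]
    bins = fft_size // 2 + 1
    power = [0.0] * bins
    for scan in range(scans):
        block = samples[scan * fft_size:(scan + 1) * fft_size]
        spectrum = fft([s * w for s, w in zip(block, window)])
        for k in range(bins):
            power[k] += abs(spectrum[k]) ** 2 / scans
    # A tiny floor avoids log10(0) for empty bins.
    return [10.0 * math.log10(p + 1e-30) for p in power]


def bin_width_hz(sample_rate, fft_size=2048):
    """Frequency spacing between adjacent FFT bins."""
    return sample_rate / fft_size
```

At an assumed 48kHz sample rate, a 2048 point FFT has bins about 23.4Hz wide, so each 2.4kHz division of the graph spans roughly 100 bins. Averaging 20 scans smooths the bin-to-bin scatter that a single FFT of noise always shows.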
The signal seems to peak about 2.4kHz and then drop smoothly about 14dB to 19.2kHz. A frequency range from 2.4kHz to 19.2kHz is exactly a factor of 8, which is 3 octaves. Noting the hump near the end, we could be looking at a typical RC rolloff of 6dB/octave from 2.4kHz, with a modest resonance affecting the curve and starting a little below 19kHz. The question is where it comes from.
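The octave arithmetic, and what a single RC pole would predict over that span, can be checked with a few lines. Placing the pole's corner exactly at 2.4kHz is an assumption made for illustration; the function names are likewise hypothetical.

```python
import math


def octaves_between(low_hz, high_hz):
    """Number of octaves (doublings) from low_hz up to high_hz."""
    return math.log2(high_hz / low_hz)


def single_pole_lowpass_db(freq_hz, corner_hz):
    """Gain in dB of an ideal first-order RC low-pass."""
    return -10.0 * math.log10(1.0 + (freq_hz / corner_hz) ** 2)


span = octaves_between(2400.0, 19200.0)
asymptotic_drop_db = 6.0 * span
# Drop between the two frequencies for one pole with its corner at 2.4kHz
# (assumed); the response is already 3dB down at the corner itself.
single_pole_drop_db = (single_pole_lowpass_db(2400.0, 2400.0)
                       - single_pole_lowpass_db(19200.0, 2400.0))

print("octaves: %.1f" % span)
print("6dB/octave asymptote: %.0f dB" % asymptotic_drop_db)
print("single pole, corner at 2.4kHz: %.1f dB" % single_pole_drop_db)
```

The straight 6dB/octave asymptote gives 18dB over 3 octaves, and an ideal single pole with its corner at 2.4kHz gives about 15dB. The observed drop of about 14dB is in the same neighborhood as the single-pole figure, which is at least consistent with an RC rolloff plus a small rise near the top of the band.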
By disconnecting the noise board and connecting an audio
oscillator to the microphone input, I could check the response.
By manually varying the audio frequency while watching the
real-time FFT, I found the microphone input was mostly flat.