Subject: Re: Redundancy eliminator From: srt@duke.cs.duke.edu (Stephen R. Tate @ Duke University Computer Science Dept.; Durham, N.C.) In article <695@hhb.UUCP> istvan@hhb.UUCP (Istvan Mohos) writes: >In article <1991Feb1.081237.27790@santra.uucp>, >t31694c@kaira.hut.fi (Tapani Lindgren) writes: >>There is a simple way to get an unbiased random sequence from a >>biased one. It has been described in several recent postings, >>so I won't repeat it here. > >A number of us have apparently missed these articles. >Re. article <14983@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn): >>...I would find the existence of such an algorithm most surprising. From the fact that you posted these two statements together, it seems that you are assuming the two previous posters are talking about the same thing --- they're not. The first posting was about removing the bias from a truly random sequence --- this is fairly trivial (see below). If I remember correctly, Doug Gwyn was talking about removing redundancy and information content from English text. These are very, very different things. As for removing the bias from a biased (but random) bit stream, consider the following method (I'm sure there are more sophisticated ways, but this was what I could come up with off the top of my head). If you take a sequence of biased bits and XOR the values, the distribution of the result gets closer and closer to 50/50 (i.e., unbiased) depending on how many bits are in the sequence. This removes quite a bit of the bias --- in fact, you can get as close to unbiased as you want by simply using large sequences of bits for each resulting bit. The point here, I suppose, is that it should be cheaper to generate the biased bits, so you can generate a lot. I would bet that in general, you cannot remove all the bias... maybe for certain values of bias it's possible, but not in general. Of course, you could also do the following, but it is no longer guaranteed to halt (for all practical purposes it will halt, though): read two consecutive bits from the biased stream. If they're the same, then start over. If the bits are 0/1 (in that order), output 0; otherwise, output 1. If the probability of a 1 in the biased bitstream is p, then the probability of outputing 0 is p(1-p), the probability of outputing 1 is p(1-p), and the probability of having to start over is p*p+(1-p)*(1-p), so this is a completely unbiased bitstream. Notice that in practical situations, the bias is probably not very high, so this is a very good solution. However, the fact that it could continue forever without generating any bits is a little unsettling to theoreticians... -- Steve Tate srt@duke.cs.duke.edu Dept. of Computer Science Duke University Durham, NC 27706