Tuesday, May 7, 2013

Steinhaus–Johnson–Trotter Algorithm

[Figure: the Hamiltonian path in the Cayley graph of the symmetric group generated by the Steinhaus–Johnson–Trotter algorithm]
The Steinhaus–Johnson–Trotter algorithm or Johnson–Trotter algorithm, also called plain changes, is an algorithm named after Hugo Steinhaus, Selmer M. Johnson and Hale F. Trotter that generates all of the permutations of n elements. Each permutation in the sequence that it generates differs from the previous permutation by swapping two adjacent elements of the sequence. Equivalently, this algorithm finds a Hamiltonian path in the permutohedron.
This method was known already to 17th-century English change ringers, and Sedgewick (1977) calls it "perhaps the most prominent permutation enumeration algorithm". As well as being simple and computationally efficient, it has the advantage that subsequent computations on the permutations that it generates may be sped up because of the fact that these permutations are so similar to each other.

Recursive structure

The sequence of permutations for a given number n can be formed from the sequence of permutations for n − 1 by placing the number n into each possible position in each of the shorter permutations. When the permutation on n − 1 items is an even permutation (as is true for the first, third, etc., permutations in the sequence) then the number n is placed in all possible positions in descending order, from n down to 1; when the permutation on n − 1 items is odd, the number n is placed in all the possible positions in ascending order.
Thus, from the single permutation on one element,
1
one may place the number 2 in each possible position in descending order to form a list of two permutations on two elements,
1 2
2 1
Then, one may place the number 3 in each of three different positions for each of these two permutations, in descending order for the first permutation 1 2, and then in ascending order for the permutation 2 1:
1 2 3
1 3 2
3 1 2
3 2 1
2 3 1
2 1 3
At the next level of recursion, the number 4 would be placed in descending order into 1 2 3, in ascending order into 1 3 2, in descending order into 3 1 2, etc. The same placement pattern, alternating between descending and ascending placements of n, applies for any larger value of n. In this way, each permutation differs from the previous one either by the single-position-at-a-time motion of n, or by a change of two smaller numbers inherited from the previous sequence of shorter permutations. In either case this difference is just the transposition of two adjacent elements. When n > 1 the first and final elements of the sequence, also, differ in only two adjacent elements (the positions of the numbers 1 and 2), as may be shown by induction.
Although this sequence may be generated by a recursive algorithm that constructs the sequence of smaller permutations and then performs all possible insertions of the largest number into the recursively-generated sequence, the actual Steinhaus–Johnson–Trotter algorithm avoids recursion, instead computing the same sequence of permutations by an iterative method.
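As a sketch, the recursive construction described above can be written directly in Python (this is an illustration of the recursive structure, not the iterative algorithm described next; the function name is ours):

```python
def sjt_recursive(n):
    """Permutations of 1..n in plain-changes order, built by inserting n
    into each permutation of 1..n-1, sweeping alternately down and up."""
    if n == 1:
        return [[1]]
    result = []
    for k, perm in enumerate(sjt_recursive(n - 1)):
        # Even-indexed permutations of the shorter sequence are even, so n
        # sweeps through the positions in descending order; otherwise ascending.
        positions = range(n - 1, -1, -1) if k % 2 == 0 else range(n)
        for pos in positions:
            result.append(perm[:pos] + [n] + perm[pos:])
    return result
```

Calling `sjt_recursive(3)` reproduces the six permutations listed above, in the same order.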

Algorithm

As described by Johnson (1963), the algorithm for generating the next permutation from a given permutation π is particularly simple:
  • For each i from 1 to n, let x_i be the position where the value i is placed in permutation π. If the order of the numbers from 1 to i − 1 in permutation π defines an even permutation, let y_i = x_i − 1; otherwise, let y_i = x_i + 1.
  • Find the largest number i for which y_i defines a valid position in permutation π that contains a number smaller than i. Swap the values in positions x_i and y_i.
When no number i can be found meeting the conditions of the second step of the algorithm, the algorithm has reached the final permutation of the sequence and terminates. This procedure may be implemented in O(n) time per permutation.
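A direct, unoptimized transcription of Johnson's rule might look like the following sketch (it recomputes the parities from scratch on each call, so it does not achieve the O(n) bound; the function name is ours):

```python
def next_permutation_sjt(perm):
    """Return the successor of perm in Johnson's ordering,
    or None if perm is the last permutation of the sequence."""
    n = len(perm)
    pos = {v: idx for idx, v in enumerate(perm)}  # x_i for each value i
    for i in range(n, 1, -1):  # try the largest i first
        # Parity of the arrangement of the values 1..i-1 within perm.
        smaller = [v for v in perm if v < i]
        inversions = sum(
            1
            for a in range(len(smaller))
            for b in range(a + 1, len(smaller))
            if smaller[a] > smaller[b]
        )
        x = pos[i]
        y = x - 1 if inversions % 2 == 0 else x + 1  # y_i
        if 0 <= y < n and perm[y] < i:
            nxt = list(perm)
            nxt[x], nxt[y] = nxt[y], nxt[x]
            return nxt
    return None  # final permutation reached
```

Iterating this step from the identity permutation enumerates all n! permutations.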
Because this method generates permutations that alternate between being even and odd, it may easily be modified to generate only the even permutations or only the odd permutations: to generate the next permutation of the same parity from a given permutation, simply apply the same procedure twice.

Even's speedup

A subsequent improvement by Shimon Even speeds up the algorithm by storing additional information for each element in the permutation: its position, and a direction (positive, negative, or zero) in which it is currently moving (essentially, this is the same information computed using the parity of the permutation in Johnson's version of the algorithm). Initially, the direction of the number 1 is zero, and all other elements have a negative direction:
1 −2 −3
At each step, the algorithm finds the largest element with a nonzero direction, and swaps it in the indicated direction:
1 −3 −2
If this causes the chosen element to reach the first or last position within the permutation, or if the next element in the same direction is larger than the chosen element, the direction of the chosen element is set to zero:
3 1 −2
After each step, all elements greater than the chosen element have their directions set to positive or negative, according to whether they are concentrated at the start or the end of the permutation respectively. Thus, in this example, when the number 2 moves, the number 3 becomes marked with a direction again:
+3 2 1
The remaining two steps of the algorithm for n = 3 are:
2 +3 1
2 1 3
When all numbers become unmarked, the algorithm terminates.
This algorithm takes time O(i) for every step in which the largest number to move is n − i + 1. Thus, the swaps involving the number n take only constant time; since these swaps account for all but a 1/n fraction of all of the swaps performed by the algorithm, the average time per permutation generated is also constant, even though a small number of permutations will take a larger amount of time.
A more complex loopless version of the same procedure allows it to be performed in constant time per permutation in every case; however, the modifications needed to eliminate loops from the procedure make it slower in practice.
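Even's variant can be sketched as a Python generator. For clarity the sketch finds positions with index(), which costs O(n) per step; a real implementation would track positions to obtain the constant amortized time discussed above:

```python
def sjt_even(n):
    """Generate all permutations of 1..n using Even's directed variant."""
    perm = list(range(1, n + 1))
    # direction[v] is -1 (moving left), +1 (moving right), or 0 (at rest).
    direction = {v: -1 for v in perm}
    direction[1] = 0
    yield perm[:]
    while True:
        moving = [v for v in perm if direction[v] != 0]
        if not moving:
            return  # all numbers unmarked: done
        v = max(moving)                      # largest marked element
        i = perm.index(v)
        j = i + direction[v]
        perm[i], perm[j] = perm[j], perm[i]  # swap in the indicated direction
        # Freeze v when it reaches a boundary or a larger neighbour lies ahead.
        k = j + direction[v]
        if k < 0 or k >= n or perm[k] > v:
            direction[v] = 0
        # Re-mark every larger element, pointing it toward the interior.
        for w in range(v + 1, n + 1):
            direction[w] = +1 if perm.index(w) < j else -1
        yield perm[:]
```

For n = 3 this generator emits exactly the sequence traced above, ending with 2 1 3.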
The algorithm defines a Hamiltonian path in a Cayley graph of the symmetric group; the inverse permutations define a path in the permutohedron.
[Figures: the Hamiltonian path in the Cayley graph and the corresponding path in the permutohedron. The permutations form a Gray code in which the swapped elements are always adjacent; the inversion vectors and inversion sets form a Gray code as well. In the figures, odd permutations are shown on a green or orange background, the smaller numbers below the permutations are the inversion vectors, and red marks indicate the swapped elements.]

Geometric interpretation

The set of all permutations of n items may be represented geometrically by a permutohedron, the polytope formed from the convex hull of n! vectors, the permutations of the vector (1,2,...n). Although defined in this way in n-dimensional space, it is actually an (n − 1)-dimensional polytope; for example, the permutohedron on four items is a three-dimensional polyhedron, the truncated octahedron. If each vertex of the permutohedron is labeled by the inverse permutation to the permutation defined by its vertex coordinates, the resulting labeling describes a Cayley graph of the symmetric group of permutations on n items, as generated by the permutations that swap adjacent pairs of items. Thus, each two consecutive permutations in the sequence generated by the Steinhaus–Johnson–Trotter algorithm correspond in this way to two vertices that form the endpoints of an edge in the permutohedron, and the whole sequence of permutations describes a Hamiltonian path in the permutohedron, a path that passes through each vertex exactly once. If the sequence of permutations is completed by adding one more edge from the last permutation to the first one in the sequence, the result is instead a Hamiltonian cycle.

Relation to Gray codes

A Gray code for numbers in a given radix is a sequence that contains each number up to a given limit exactly once, in such a way that each pair of consecutive numbers differs by one in a single digit. The n! permutations of the n numbers from 1 to n may be placed in one-to-one correspondence with the n! numbers from 0 to n! − 1 by pairing each permutation with the sequence of numbers ci that count the number of positions in the permutation that are to the right of value i and that contain a value less than i (that is, the number of inversions for which i is the larger of the two inverted values), and then interpreting these sequences as numbers in the factorial number system, that is, the mixed radix system with radix sequence (1,2,3,4,...). For instance, the permutation (3,1,4,5,2) would give the values c1 = 0, c2 = 0, c3 = 2, c4 = 1, and c5 = 1. The sequence of these values, (0,0,2,1,1), gives the number
0 × 0! + 0 × 1! + 2 × 2! + 1 × 3! + 1 × 4! = 34.
Consecutive permutations in the sequence generated by the Steinhaus–Johnson–Trotter algorithm have numbers of inversions that differ by one, forming a Gray code for the factorial number system.
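This correspondence can be checked with a short sketch that computes the counts c_i and the resulting factorial-base value (the function names are illustrative):

```python
import math

def inversion_digits(perm):
    """c_i = number of positions right of value i holding a smaller value."""
    digits = []
    for i in range(1, len(perm) + 1):
        x = perm.index(i)
        digits.append(sum(1 for v in perm[x + 1:] if v < i))
    return digits

def factorial_number(digits):
    """Interpret (c_1, ..., c_n) in the factorial number system:
    digit c_i has weight (i-1)!."""
    return sum(c * math.factorial(i) for i, c in enumerate(digits))
```

For the permutation (3, 1, 4, 5, 2) from the example, inversion_digits returns [0, 0, 2, 1, 1] and factorial_number gives 34.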
More generally, combinatorial algorithms researchers have defined a Gray code for a set of combinatorial objects to be an ordering for the objects in which each two consecutive objects differ in the minimal possible way. In this generalized sense, the Steinhaus–Johnson–Trotter algorithm generates a Gray code for the permutations themselves.

History

The algorithm is named after Hugo Steinhaus, Selmer M. Johnson and Hale F. Trotter. Johnson and Trotter discovered the algorithm independently of each other in the early 1960s. A book by Steinhaus, originally published in 1958 and translated into English in 1963, describes a related puzzle of generating all permutations by a system of particles, each moving at constant speed along a line and swapping positions when one particle overtakes another. No solution is possible for n > 3, because the number of swaps is far fewer than the number of permutations, but the Steinhaus–Johnson–Trotter algorithm describes the motion of particles with non-constant speeds that generate all permutations.
Outside of mathematics, the same method was known for much longer as a method for change ringing of church bells: it gives a procedure by which a set of bells can be rung through all possible permutations, changing the order of only two bells per change. These so-called "plain changes" were recorded as early as 1621 for four bells, and a 1677 book by Fabian Stedman lists the solutions for up to six bells. More recently, change ringers have abided by a rule that no bell may stay in the same position for three consecutive permutations; this rule is violated by the plain changes, so other strategies that swap multiple bells per change have been devised.

Fisher–Yates Shuffle

The Fisher–Yates shuffle (named after Ronald Fisher and Frank Yates), also known as the Knuth shuffle (after Donald Knuth), is an algorithm for generating a random permutation of a finite set—in plain terms, for randomly shuffling the set. A variant of the Fisher–Yates shuffle, known as Sattolo's algorithm, may be used to generate random cycles of length n instead. Properly implemented, the Fisher–Yates shuffle is unbiased, so that every permutation is equally likely. The modern version of the algorithm is also rather efficient, requiring only time proportional to the number of items being shuffled and no additional storage space.
The basic process of Fisher–Yates shuffling is similar to randomly picking numbered tickets out of a hat, or cards from a deck, one after another until there are no more left. What the specific algorithm provides is a way of doing this numerically in an efficient and rigorous manner that, properly done, guarantees an unbiased result.

Fisher and Yates' original method

The Fisher–Yates shuffle, in its original form, was described in 1938 by Ronald A. Fisher and Frank Yates in their book Statistical tables for biological, agricultural and medical research. (Later editions describe a somewhat different method attributed to C. R. Rao.) Their method was designed to be implemented using pencil and paper, with a precomputed table of random numbers as the source of randomness. The basic method given for generating a random permutation of the numbers 1–N goes as follows:
  1. Write down the numbers from 1 to N.
  2. Pick a random number k between one and the number of unstruck numbers remaining (inclusive).
  3. Counting from the low end, strike out the kth number not yet struck out, and write it down elsewhere.
  4. Repeat from step 2 until all the numbers have been struck out.
  5. The sequence of numbers written down in step 3 is now a random permutation of the original numbers.
Provided that the random numbers picked in step 2 above are truly random and unbiased, so will the resulting permutation be. Fisher and Yates took care to describe how to obtain such random numbers in any desired range from the supplied tables in a manner which avoids any bias. They also suggested the possibility of using a simpler method — picking random numbers from one to N and discarding any duplicates—to generate the first half of the permutation, and only applying the more complex algorithm to the remaining half, where picking a duplicate number would otherwise become frustratingly common.
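The five steps above transcribe almost directly into code; a minimal sketch follows (the zero-based list.pop does the "counting and striking", which is exactly what makes the original method quadratic on a computer):

```python
import random

def fisher_yates_1938(n, rng=random):
    """Original pencil-and-paper method: repeatedly strike out the
    k-th remaining number and write it down."""
    scratch = list(range(1, n + 1))      # step 1: write down 1..N
    result = []
    while scratch:
        k = rng.randrange(len(scratch))  # step 2 (zero-based roll)
        result.append(scratch.pop(k))    # step 3: strike out and record
    return result                        # step 5: a random permutation
```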

The modern algorithm

The modern version of the Fisher–Yates shuffle, designed for computer use, was introduced by Richard Durstenfeld in 1964 in Communications of the ACM volume 7, issue 7, as "Algorithm 235: Random permutation", and was popularized by Donald E. Knuth in volume 2 of his book The Art of Computer Programming as "Algorithm P". Neither Durstenfeld nor Knuth, in the first edition of his book, acknowledged the earlier work of Fisher and Yates in any way, and may not have been aware of it. Subsequent editions of The Art of Computer Programming do, however, mention Fisher and Yates' contribution.
The algorithm described by Durstenfeld differs from that given by Fisher and Yates in a small but significant way. Whereas a naive computer implementation of Fisher and Yates' method would spend needless time counting the remaining numbers in step 3 above, Durstenfeld's solution is to move the "struck" numbers to the end of the list by swapping them with the last unstruck number at each iteration. This reduces the algorithm's time complexity to O(n), compared to O(n^2) for the naive implementation. This change gives the following algorithm (for a zero-based array).
To shuffle an array a of n elements (indices 0..n-1):
for i from n − 1 downto 1 do
j ← random integer with 0 ≤ ji
exchange a[j] and a[i]
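A direct Python rendering of this pseudocode might look as follows (a sketch using the standard random module; any unbiased integer source would do):

```python
import random

def fisher_yates_shuffle(a, rng=random):
    """Shuffle list a in place; every permutation is equally likely
    provided rng.randrange is unbiased."""
    for i in range(len(a) - 1, 0, -1):
        j = rng.randrange(i + 1)  # random integer with 0 <= j <= i
        a[j], a[i] = a[i], a[j]
    return a
```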

The "inside-out" algorithm

The Fisher–Yates shuffle, as implemented by Durstenfeld, is an in-place shuffle. That is, given a preinitialized array, it shuffles the elements of the array in place, rather than producing a shuffled copy of the array. This can be an advantage if the array to be shuffled is large.
To simultaneously initialize and shuffle an array, a bit more efficiency can be attained by doing an "inside-out" version of the shuffle. In this version, one successively places element number i into a random position among the first i positions in the array, after moving the element previously occupying that position to position i. In case the random position happens to be number i, this "move" (to the same place) involves an uninitialised value, but that does not matter, as the value is then immediately overwritten. No separate initialization is needed, and no exchange is performed. In the common case where source is defined by some simple function, such as the integers from 0 to n - 1, source can simply be replaced with the function since source is never altered during execution.
To initialize an array a of n elements to a randomly shuffled copy of source, both 0-based:
a[0] ← source[0]
for i from 1 to n − 1 do
j ← random integer with 0 ≤ ji
a[i] ← a[j]
a[j] ← source[i]
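A Python sketch of this version follows; the j = i case is handled with an append, which sidesteps the "uninitialised value" mentioned above (the function name is ours):

```python
import random

def inside_out_shuffle(source, rng=random):
    """Return a shuffled copy of source, initializing and shuffling at once."""
    a = []
    for i, x in enumerate(source):
        j = rng.randrange(i + 1)  # random integer with 0 <= j <= i
        if j == i:
            a.append(x)           # new element lands in the new slot
        else:
            a.append(a[j])        # old occupant of slot j moves to slot i
            a[j] = x
    return a
```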
The inside-out shuffle can be seen to be correct by induction; every one of the n! different sequences of random numbers that could be obtained from the calls of random will produce a different permutation of the values, so all of these are obtained exactly once.
Another advantage of this technique is that the algorithm can be modified so that even when we do not know "n", the number of elements in source, we can still generate a uniformly distributed random permutation of the source data. Below the array a is built iteratively starting from empty, and a.length represents the current number of elements seen.
To initialize an empty array a to a randomly shuffled copy of source whose length is not known:
while source.moreDataAvailable
j ← random integer with 0 ≤ ja.length
if j = a.length
a.append(source.next)
else
a.append(a[j])
a[j] ← source.next
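The same idea can be sketched in Python with an ordinary iterator standing in for source, so that the length never needs to be known in advance:

```python
import random

def shuffle_stream(stream, rng=random):
    """Uniformly shuffle the items of an iterator of unknown length."""
    a = []
    for item in stream:
        j = rng.randrange(len(a) + 1)  # 0 <= j <= current length of a
        if j == len(a):
            a.append(item)
        else:
            a.append(a[j])
            a[j] = item
    return a
```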

Examples

Pencil-and-paper method

As an example, we'll permute the numbers from 1 to 8 using Fisher and Yates' original method. We'll start by writing the numbers out on a piece of scratch paper:
Range | Roll | Scratch         | Result
      |      | 1 2 3 4 5 6 7 8 |
Now we roll a random number k from 1 to 8—let's make it 3—and strike out the kth (i.e. third) number (3, of course) on the scratch pad and write it down as the result:
Range | Roll | Scratch           | Result
1–8   | 3    | 1 2 (3) 4 5 6 7 8 | 3
(Struck-out numbers are shown in parentheses.)
Now we pick a second random number, this time from 1 to 7: it turns out to be 4. Now we strike out the fourth number not yet struck off the scratch pad—that's number 5—and add it to the result:
Range | Roll | Scratch             | Result
1–7   | 4    | 1 2 (3) 4 (5) 6 7 8 | 3 5
Now we pick the next random number from 1 to 6, and then from 1 to 5, and so on, always repeating the strike-out process as above:
Range | Roll | Scratch                         | Result
1–6   | 5    | 1 2 (3) 4 (5) 6 (7) 8           | 3 5 7
1–5   | 3    | 1 2 (3) (4) (5) 6 (7) 8         | 3 5 7 4
1–4   | 4    | 1 2 (3) (4) (5) 6 (7) (8)       | 3 5 7 4 8
1–3   | 1    | (1) 2 (3) (4) (5) 6 (7) (8)     | 3 5 7 4 8 1
1–2   | 2    | (1) 2 (3) (4) (5) (6) (7) (8)   | 3 5 7 4 8 1 6
      |      | (1) (2) (3) (4) (5) (6) (7) (8) | 3 5 7 4 8 1 6 2

Modern method

We'll now do the same thing using Durstenfeld's version of the algorithm: this time, instead of striking out the chosen numbers and copying them elsewhere, we'll swap them with the last number not yet chosen. We'll start by writing out the numbers from 1 to 8 as before:
Range | Roll | Scratch         | Result
      |      | 1 2 3 4 5 6 7 8 |
For our first roll, we roll a random number from 1 to 8: this time it's 6, so we swap the 6th and 8th numbers in the list:
Range | Roll | Scratch         | Result
1–8   | 6    | 1 2 3 4 5 8 7   | 6
The next random number we roll from 1 to 7, and turns out to be 2. Thus, we swap the 2nd and 7th numbers and move on:
Range | Roll | Scratch       | Result
1–7   | 2    | 1 7 3 4 5 8   | 2 6
The next random number we roll is from 1 to 6, and just happens to be 6, which means we leave the 6th number in the list (which, after the swap above, is now number 8) in place and just move to the next step. Again, we proceed the same way until the permutation is complete:
Range | Roll | Scratch     | Result
1–6   | 6    | 1 7 3 4 5   | 8 2 6
1–5   | 1    | 5 7 3 4     | 1 8 2 6
1–4   | 3    | 5 7 4       | 3 1 8 2 6
1–3   | 3    | 5 7         | 4 3 1 8 2 6
1–2   | 1    | 7           | 5 4 3 1 8 2 6
At this point there's nothing more that can be done, so the resulting permutation is 7 5 4 3 1 8 2 6.

Variants

Sattolo's algorithm

A very similar algorithm was published in 1986 by Sandra Sattolo for generating uniformly distributed cycles of (maximal) length n. The only difference between Durstenfeld's and Sattolo's algorithms is that in the latter, the random number j is chosen from the range 0 ≤ j ≤ i − 1 rather than 0 ≤ j ≤ i (in the zero-based formulation given above). This simple change modifies the algorithm so that the resulting permutation always consists of a single cycle.
In fact, as described below, it's quite easy to accidentally implement Sattolo's algorithm when the ordinary Fisher–Yates shuffle is intended. This will bias the results by causing the permutations to be picked from the smaller set of (n−1)! cycles of length n, instead of from the full set of all n! possible permutations.
The fact that Sattolo's algorithm always produces a cycle of length n can be shown by induction. Assume by induction that after the initial iteration of the loop, the remaining iterations permute the first n − 1 elements according to a cycle of length n − 1 (those remaining iterations are just Sattolo's algorithm applied to those first n − 1 elements). This means that tracing the initial element to its new position p, then the element originally at position p to its new position, and so forth, one only gets back to the initial position after having visited all other positions. Suppose the initial iteration swapped the final element with the one at (non-final) position k, and that the subsequent permutation of first n − 1 elements then moved it to position l; we compare the permutation π of all n elements with that remaining permutation σ of the first n − 1 elements. Tracing successive positions as just mentioned, there is no difference between π and σ until arriving at position k. But then, under π the element originally at position k is moved to the final position rather than to position l, and the element originally at the final position is moved to position l. From there on, the sequence of positions for π again follows the sequence for σ, and all positions will have been visited before getting back to the initial position, as required.
As for the equal probability of the permutations, it suffices to observe that the modified algorithm involves (n−1)! distinct possible sequences of random numbers produced, each of which clearly produces a different permutation, and each of which occurs—assuming the random number source is unbiased—with equal probability. The (n−1)! different permutations so produced precisely exhaust the set of cycles of length n: each such cycle has a unique cycle notation with the value n in the final position, which allows for (n−1)! permutations of the remaining values to fill the other positions of the cycle notation.
A sample implementation of Sattolo's algorithm in Python is:
from random import randrange

def sattoloCycle(items):
    """Randomly permute items in place into a single cycle of length n."""
    i = len(items)
    while i > 1:
        i = i - 1
        j = randrange(i)  # 0 <= j <= i-1
        items[j], items[i] = items[i], items[j]

Comparison with other shuffling algorithms

The Fisher–Yates shuffle is quite efficient; indeed, its asymptotic time and space complexity are optimal. Combined with a high-quality unbiased random number source, it is also guaranteed to produce unbiased results. Compared to some other solutions, it also has the advantage that, if only part of the resulting permutation is needed, it can be stopped halfway through, or even stopped and restarted repeatedly, generating the permutation incrementally as needed.
An alternative method assigns a random number to each element of the set to be shuffled and then sorts the set according to the assigned numbers. The sorting method has worse asymptotic time complexity: sorting is O(n log n) which is worse than Fisher–Yates' O(n). Like the Fisher–Yates shuffle, the sorting method produces unbiased results, but the sorting method may be more tolerant of certain kinds of bias in the random numbers.[citation needed] However, care must be taken to ensure that the assigned random numbers are never duplicated, since sorting algorithms typically don't order elements randomly in case of a tie.
A variant of the above method that has seen some use in languages that support sorting with user-specified comparison functions is to shuffle a list by sorting it with a comparison function that returns random values. However, this is an extremely bad method: it is very likely to produce highly non-uniform distributions, which in addition depend heavily on the sorting algorithm used. For instance suppose quicksort is used as the sorting algorithm, with a fixed element selected as first pivot element. The algorithm starts comparing the pivot with all other elements to separate them into those less and those greater than it, and the relative sizes of those groups will determine the final place of the pivot element. For a uniformly distributed random permutation, each possible final position should be equally likely for the pivot element, but if each of the initial comparisons returns "less" or "greater" with equal probability, then that position will have a binomial distribution for p = 1/2, which gives positions near the middle of the sequence with a much higher probability than positions near the ends. Other sorting methods like merge sort may produce results that appear more uniform, but are not quite so either, since merging two sequences by repeatedly choosing one of them with equal probability (until the choice is forced by the exhaustion of one sequence) does not produce results with a uniform distribution; instead the probability to choose a sequence should be proportional to the number of elements left in it. In fact no method that uses only two-way random events with equal probability ("coin flipping"), repeated a bounded number of times, can produce permutations of a sequence (of more than two elements) with a uniform distribution, because every execution path will have as probability a rational number with as denominator a power of 2, while the required probability 1/n! for each possible permutation is not of that form.
In principle this shuffling method can even result in program failures like endless loops or access violations, because the correctness of a sorting algorithm may depend on properties of the order relation (like transitivity) that a comparison producing random values will certainly not have. While this kind of behaviour should not occur with sorting routines that never perform a comparison whose outcome can be predicted with certainty (based on previous comparisons), there can be valid reasons for deliberately making such comparisons. For instance the fact that any element should compare equal to itself allows using it as a sentinel value for efficiency reasons, and if this is the case, a random comparison function would break the sorting algorithm.

Potential sources of bias

Care must be taken when implementing the Fisher–Yates shuffle, both in the implementation of the algorithm itself and in the generation of the random numbers it is built on, otherwise the results may show detectable bias. A number of common sources of bias have been listed below.

Implementation errors

A common error when implementing the Fisher–Yates shuffle is to pick the random numbers from the wrong range. The flawed algorithm may appear to work correctly, but it will not produce each possible permutation with equal probability, and it may not produce certain permutations at all. For example, a common off-by-one error would be choosing the index j of the entry to swap in the example above to be always strictly less than the index i of the entry it will be swapped with. This turns the Fisher–Yates shuffle into Sattolo's algorithm, which produces only permutations consisting of a single cycle involving all elements: in particular, with this modification, no element of the array can ever end up in its original position.
[Figures: order bias from an incorrect implementation, and the same bias for n = 1000.]
Similarly, always selecting j from the entire range of valid array indices on every iteration also produces a result which is biased, albeit less obviously so. This can be seen from the fact that doing so yields n^n distinct possible sequences of swaps, whereas there are only n! possible permutations of an n-element array. Since n^n can never be evenly divisible by n! when n > 2 (as the latter is divisible by n−1, which shares no prime factors with n), some permutations must be produced by more of the n^n sequences of swaps than others. As a concrete example of this bias, observe the distribution of possible outcomes of shuffling a three-element array [1, 2, 3]. There are 6 possible permutations of this array (3! = 6), but the algorithm produces 27 possible shuffles (3^3 = 27). In this case, [1, 2, 3], [3, 1, 2], and [3, 2, 1] each result from 4 of the 27 shuffles, while each of the remaining 3 permutations occurs in 5 of the 27 shuffles.
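The three-element count is easy to verify exhaustively; the sketch below enumerates every swap sequence made by the biased variant, looping i upward and drawing j from the full range each time (the function name is ours):

```python
from itertools import product
from collections import Counter

def naive_shuffle_counts(n):
    """How often each permutation arises over all n**n swap sequences of
    the biased shuffle that draws j from the full range at every step."""
    counts = Counter()
    for swaps in product(range(n), repeat=n):
        a = list(range(1, n + 1))
        for i, j in enumerate(swaps):
            a[i], a[j] = a[j], a[i]
        counts[tuple(a)] += 1
    return counts
```

For n = 3 the 27 sequences split unevenly over the 6 permutations: three permutations arise 4 times each and three arise 5 times each.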
The matrix to the right shows the probability of each element in a list of length 7 ending up in any other position. Observe that for most elements, ending up in their original position (the matrix's main diagonal) has lowest probability, and moving one slot backwards has highest probability.

Modulo bias

Doing a Fisher–Yates shuffle involves picking uniformly distributed random integers from various ranges. Most random number generators, however—whether true or pseudorandom—will only directly provide numbers in some fixed range, such as, say, from 0 to 2^32 − 1. A simple and commonly used way to force such numbers into a desired smaller range is to apply the modulo operator; that is, to divide them by the size of the range and take the remainder. However, the need, in a Fisher–Yates shuffle, to generate random numbers in every range from 0–1 to 0–n pretty much guarantees that some of these ranges will not evenly divide the natural range of the random number generator. Thus, the remainders will not always be evenly distributed and, worse yet, the bias will be systematically in favor of small remainders.
For example, assume that your random number source gives numbers from 0 to 99 (as was the case for Fisher and Yates' original tables), and that you wish to obtain an unbiased random number from 0 to 15. If you simply divide the numbers by 16 and take the remainder, you'll find that the numbers 0–3 occur about 17% more often than others. This is because 16 does not evenly divide 100: the largest multiple of 16 less than or equal to 100 is 6×16 = 96, and it is the numbers in the incomplete range 96–99 that cause the bias. The simplest way to fix the problem is to discard those numbers before taking the remainder and to keep trying again until a number in the suitable range comes up. While in principle this could, in the worst case, take forever, the expected number of retries will always be less than one.
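The discard-and-retry fix is only a few lines. The sketch below assumes a hypothetical next_raw() callable that returns uniform values in 0..99, as in the example:

```python
def unbiased_mod(n, next_raw):
    """Map a uniform 0..99 source onto 0..n-1 without modulo bias by
    rejecting raw values at or above the largest multiple of n <= 100."""
    limit = (100 // n) * n  # e.g. 96 when n = 16
    while True:
        r = next_raw()
        if r < limit:        # keep only values from complete ranges
            return r % n
```

For n = 16 the raw values 96–99 are rejected, so every residue 0–15 arises from exactly six of the remaining raw values.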
A related problem occurs with implementations that first generate a random floating-point number—usually in the range [0,1)—and then multiply it by the size of the desired range and round down. The problem here is that random floating-point numbers, however carefully generated, always have only finite precision. This means that there are only a finite number of possible floating point values in any given range, and if the range is divided into a number of segments that doesn't divide this number evenly, some segments will end up with more possible values than others. While the resulting bias will not show the same systematic downward trend as in the previous case, it will still be there.

Pseudorandom generators: problems involving state space, seeding, and usage

An additional problem occurs when the Fisher–Yates shuffle is used with a pseudorandom number generator or PRNG: as the sequence of numbers output by such a generator is entirely determined by its internal state at the start of a sequence, a shuffle driven by such a generator cannot possibly produce more distinct permutations than the generator has distinct possible states. Even when the number of possible states exceeds the number of permutations, the irregular nature of the mapping from sequences of numbers to permutations means that some permutations will occur more often than others. Thus, to minimize bias, the number of states of the PRNG should exceed the number of permutations by at least several orders of magnitude.
For example, the built-in pseudorandom number generator provided by many programming languages and/or libraries may often have only 32 bits of internal state, which means it can only produce 2^32 different sequences of numbers. If such a generator is used to shuffle a deck of 52 playing cards, it can only ever produce a very small fraction of the 52! ≈ 2^225.6 possible permutations. It's impossible for a generator with less than 226 bits of internal state to produce all the possible permutations of a 52-card deck. It has been suggested that confidence that the shuffle is unbiased can only be attained with a generator with more than about 250 bits of state.[citation needed]
Also, of course, no pseudorandom number generator can produce more distinct sequences, starting from the point of initialization, than there are distinct seed values it may be initialized with. Thus, a generator that has 1024 bits of internal state but which is initialized with a 32-bit seed can still only produce 2^32 different permutations right after initialization. It can produce more permutations if one exercises the generator a great many times before starting to use it for generating permutations, but this is a very inefficient way of increasing randomness: supposing one can arrange to use the generator a random number of up to a billion (say 2^30 for simplicity) times between initialization and generating permutations, the number of possible permutations is still only 2^62.
A further problem occurs when a simple linear congruential PRNG is used with the divide-and-take-remainder method of range reduction described above. The problem here is that the low-order bits of a linear congruential PRNG are less random than the high-order ones: the low n bits of the generator themselves have a period of at most 2^n. When the divisor is a power of two, taking the remainder essentially means throwing away the high-order bits, such that one ends up with a significantly less random value. This is an example of the general rule that a poor-quality RNG or PRNG will produce poor-quality shuffles.
Finally, it is to be noted that even with perfect random number generation, flaws can be introduced into an implementation by improper usage of the generator. For example, suppose a Java implementation creates a new generator for each call to the shuffler, without passing constructor arguments. The generator will then be default-seeded by the language's time-of-day (System.currentTimeMillis() in the case of Java). So if two callers call the shuffler within a time-span less than the granularity of the clock (one millisecond in the case of Java), the generators they create will be identical, and (for arrays of the same length) the same permutation will be generated. This is almost certain to happen if the shuffler is called many times in rapid succession, leading to an extremely non-uniform distribution in such cases; it can also apply to independent calls from different threads. A more robust Java implementation would use a single static instance of the generator defined outside the shuffler function.
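The Java scenario above can be mimicked in any language; here is a Python sketch (the helper name and the fixed seed are illustrative) that forces the shared-seed collision explicitly:

```python
import random

def shuffled(items, rng):
    """Fisher–Yates shuffle driven by the supplied generator."""
    a = list(items)
    for i in range(len(a) - 1, 0, -1):
        j = rng.randrange(i + 1)
        a[i], a[j] = a[j], a[i]
    return a

deck = list(range(52))
# Two generators built with the same seed (as happens when each call
# constructs a fresh clock-seeded generator within one clock tick)
# produce the very same "random" permutation:
a = shuffled(deck, random.Random(12345))
b = shuffled(deck, random.Random(12345))
assert a == b
# The fix: create one generator up front and reuse it for every call.
rng = random.Random()
c, d = shuffled(deck, rng), shuffled(deck, rng)
```

Reusing one generator corresponds to the "single static instance" remedy described above.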

Factorial Number System

In combinatorics, the factorial number system, also called factoradic, is a mixed radix numeral system adapted to numbering permutations. It is also called factorial base, although factorials do not function as the base but as the place values of the digits. By converting a number less than n! to factorial representation, one obtains a sequence of n digits that can be converted to a permutation of n elements in a straightforward way, either using them as a Lehmer code or as an inversion table representation; in the former case the resulting map from integers to permutations of n elements lists them in lexicographical order. General mixed radix systems were studied by Georg Cantor. The term "factorial number system" is used by Knuth, while the French equivalent "numération factorielle" was first used in 1888. The term "factoradic", which is a portmanteau of factorial and mixed radix, appears to be of more recent date.

Definition

The factorial number system is a mixed radix numeral system: the i-th digit from the right has base i, which means that the digit must be strictly less than i, and that (taking into account the bases of the less significant digits) its value is to be multiplied by (i − 1)! (its place value).
Radix:                    8     7     6     5     4     3     2     1
Place value:              7!    6!    5!    4!    3!    2!    1!    0!
Place value in decimal:   5040  720   120   24    6     2     1     1
Highest digit allowed:    7     6     5     4     3     2     1     0
From this it follows that the rightmost digit is always 0, the second can be 0 or 1, the third 0, 1 or 2, and so on. The factorial number system is sometimes defined with the 0! place omitted because it is always zero (sequence A007623 in OEIS). Conversely, a further unchanging zero digit may be added in the rightmost position for the 0! place.
In this article, a factorial number representation will be flagged by a subscript "!", and in expanded examples each digit may be subscripted by its radix; so for instance 341010! stands for 3₆4₅1₄0₃1₂0₁, whose value is
  • = 3×5! + 4×4! + 1×3! + 0×2! + 1×1! + 0×0!
  • = ((((3×5 + 4)×4 + 1)×3 + 0)×2 + 1)×1 + 0
  • = 463₁₀.
(Note that each place value is the factorial of one less than its radix, which is why these equations begin with 5! for a six-digit factoradic.)
General properties of mixed radix number systems also apply to the factorial number system. For instance, one can convert a number into factorial representation producing digits from right to left, by repeatedly dividing the number by the radices (1, 2, 3, ...), taking each remainder as a digit, and continuing with the integer quotient until this quotient becomes 0.
Alternatively, working from the left, 463₁₀ can be transformed into a factorial representation by successively dividing by the place values (5!, 4!, ..., 0!), each quotient becoming a digit:
  • 463 ÷ 5! = 3 with a remainder of 103
  • 103 ÷ 4! = 4 and remainder 7
  • 7 ÷ 3! = 1 and remainder 1
  • 1 ÷ 2! = 0 and remainder 1
  • 1 ÷ 1! = 1 and remainder 0
  • 0 ÷ 0! = 0 and remainder 0
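Both directions of the conversion can be sketched in Python (the function names are illustrative):

```python
def to_factoradic(n):
    """Digits of n in factorial base, most significant first, produced
    right to left by dividing by the radices 1, 2, 3, ... in turn."""
    digits = [0]          # the 0! digit is always 0
    radix = 2
    while n:
        n, r = divmod(n, radix)
        digits.append(r)
        radix += 1
    return digits[::-1]

def from_factoradic(digits):
    """Evaluate factorial-base digits by the nested (Horner-style)
    scheme ((((3×5 + 4)×4 + 1)×3 + 0)×2 + 1)×1 + 0 shown above."""
    n = 0
    for k, d in enumerate(digits):
        n = n * (len(digits) - k) + d
    return n

print(to_factoradic(463))    # [3, 4, 1, 0, 1, 0], i.e. 341010!
```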
In principle, this system may be extended to represent fractional numbers, though rather than the natural extension of place values (−1)!, (−2)!, etc., which are undefined, the symmetric choice of radix values n = 0, 1, 2, 3, 4, etc. after the point may be used instead. Again, the 0 and 1 places may be omitted as these are always zero. The corresponding place values are therefore 1/1, 1/1, 1/2, 1/6, 1/24, ..., 1/n!, etc.

Examples

Here are the first twenty-four numbers, counting from zero.
The table on the left shows permutations, with their inversion vectors (which are reflected factorial numbers) below them. Another column shows the inversion sets. The digit sums of the inversion vectors (or factorial numbers) and the cardinalities of the inversion sets are equal (and have the same parity as the permutation); they form the sequence A034968 in the OEIS.
[Figure: permutohedron graph showing permutations and their inversion vectors (compare the version with factorial numbers); the arrows indicate the bitwise less-or-equal relation ≤]
[Figure: permutation list with matrices for the symmetric group S4]
decimal   factorial
0         0
1         10
2         100
3         110
4         200
5         210
6         1000
7         1010
8         1100
9         1110
10        1200
11        1210
12        2000
13        2010
14        2100
15        2110
16        2200
17        2210
18        3000
19        3010
20        3100
21        3110
22        3200
23        3210
For another example, the greatest number that can be represented with six digits is 543210!, which equals 719 in decimal:
5×5! + 4×4! + 3×3! + 2×2! + 1×1! + 0×0!.
Clearly the next factorial number representation after 543210! is 1000000!, which designates 6! = 720₁₀, the place value for the radix-7 digit. So the former number, and its expanded sum above, is equal to:
6! − 1.
The factorial number system provides a unique representation for each natural number, with the given restriction on the "digits" used. No number can be represented in more than one way because the sum of consecutive factorials multiplied by their index is always the next factorial minus one:
 \sum_{i=0}^n {i\cdot i!} = {(n+1)!} - 1.
This can be easily proved with mathematical induction.
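The identity can also be checked numerically (a small Python sketch):

```python
from math import factorial

# Check sum_{i=0}^{n} i * i! == (n+1)! - 1 for small n; each new term
# lifts the previous total (n! - 1) up to (n+1)! - 1.
for n in range(10):
    assert sum(i * factorial(i) for i in range(n + 1)) == factorial(n + 1) - 1
```

This is the factorial-base analogue of 999 + 1 = 1000 in decimal: a string of maximal digits is one short of the next place value.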
However, when using Arabic numerals to write the digits (and not including the subscripts as in the above examples), their simple concatenation becomes ambiguous for numbers having a "digit" greater than 9. The smallest such example is the number 10 × 10! = 36288000₁₀, which may be written A0000000000!, but not 100000000000!, which denotes 11! = 39916800₁₀. Thus using letters A–Z to denote digits 10, ..., 35, as in other base-N systems, makes the largest representable number 36! − 1 = 371993326789901217467999448150835199999999₁₀. For arbitrarily greater numbers one has to choose a base for representing individual digits, say decimal, and provide a separating mark between them (for instance by subscripting each digit by its base, also given in decimal). In fact the factorial number system itself is not truly a numeral system in the sense of providing a representation for all natural numbers using only a finite alphabet of symbols.

Permutations

There is a natural mapping between the integers 0, ..., n! − 1 (or equivalently the numbers with n digits in factorial representation) and permutations of n elements in lexicographical order, when the integers are expressed in factoradic form. This mapping has been termed the Lehmer code (or inversion table). For example, with n = 3, such a mapping is
decimal   factorial   permutation
0₁₀       000!        (0,1,2)
1₁₀       010!        (0,2,1)
2₁₀       100!        (1,0,2)
3₁₀       110!        (1,2,0)
4₁₀       200!        (2,0,1)
5₁₀       210!        (2,1,0)
The leftmost factoradic digit 0, 1, or 2 is chosen as the first permutation digit from the ordered list (0,1,2) and is then removed from the list. Think of this new list as zero-indexed; each successive factoradic digit dictates which of the remaining elements is to be chosen. If the second factoradic digit is 0, the first element of the list is selected for the second permutation digit and is then removed from the list; if it is 1, the second element is selected and removed. The final factoradic digit is always 0, and since the list by then contains only one element, that element is selected as the last permutation digit.
The process may become clearer with a longer example. Here is how the digits in the factoradic 4041000! (equal to 2982₁₀) pick out the digits of (4,0,6,2,1,3,5), the 2982nd permutation of the numbers 0 through 6.
                    4041000! → (4,0,6,2,1,3,5)

factoradic digit    remaining elements    element picked
4                   (0,1,2,3,4,5,6)       4
0                   (0,1,2,3,5,6)         0
4                   (1,2,3,5,6)           6
1                   (1,2,3,5)             2
0                   (1,3,5)               1
0                   (3,5)                 3
0                   (5)                   5
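The selection process can be sketched in Python (the helper name is illustrative):

```python
def permutation_from_factoradic(digits, elements):
    """Read factorial-base digits as a Lehmer code: each digit picks
    (and removes) an element by index from the shrinking list."""
    pool = list(elements)
    return tuple(pool.pop(d) for d in digits)

# The worked example: 4041000! picks out the 2982nd permutation of 0..6.
assert permutation_from_factoradic([4, 0, 4, 1, 0, 0, 0], range(7)) == (4, 0, 6, 2, 1, 3, 5)
```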
A natural index for the group direct product of two permutation groups is the concatenation of two factoradic numbers, with two subscript "!"s.
          concatenated
decimal   factoradics    permutation pair
0₁₀       000!000!       ((0,1,2),(0,1,2))
1₁₀       000!010!       ((0,1,2),(0,2,1))
...
5₁₀       000!210!       ((0,1,2),(2,1,0))
6₁₀       010!000!       ((0,2,1),(0,1,2))
7₁₀       010!010!       ((0,2,1),(0,2,1))
...
22₁₀      110!200!       ((1,2,0),(2,0,1))
...
34₁₀      210!200!       ((2,1,0),(2,0,1))
35₁₀      210!210!       ((2,1,0),(2,1,0))

Fractional values

Unlike single-radix systems, whose place values are base^n for both positive and negative integral n, the factorial number base cannot be extended to negative place values, as these would be (−1)!, (−2)!, and so on, and these values are undefined.
One possible extension is therefore to use 1/0!, 1/1!, 1/2!, 1/3!, ..., 1/n! etc. instead, possibly omitting the 1/0! and 1/1! places which are always zero.
With this method, all rational numbers have a terminating expansion, whose length in 'digits' is less than or equal to the denominator of the rational number represented. This may be proven by considering that there exists a factorial for any integer and therefore the denominator divides into its own factorial even if it does not divide into any smaller factorial.
By necessity, therefore, the factoradic expansion of the reciprocal of a prime has a length of exactly that prime (less one if the 1/1! place is omitted). It can also be proven that the last 'digit' or term of the representation of a rational with prime denominator is equal to the difference between the numerator and the prime denominator.
There is also a non-terminating equivalent for every rational number akin to the fact that in decimal 0.24999... = 0.25 = 1/4 and 0.999... = 1, etc., which can be created by reducing the final term by 1 and then filling in the remaining infinite number of terms with the highest value possible for the radix of that position.
In the following selection of examples, spaces are used to separate the place values, otherwise represented in decimal, and the places whose values are always zero (1!, 0!, 1/1!) have been omitted. The rational numbers on the left are also in decimal:
  • 1/2 = 0.1_! = 0.0\ 2\ 3\ 4\ 5\ 6\ 7\ 8\ 9\ 10..._!
  • 1/3 = 0.0\ 2_!
  • 1/11 = 0.0\ 0\ 2\ 0\ 5\ 3\ 1\ 4\ 0\ 10_!
  • 2/11 = 0.0\ 1\ 0\ 1\ 4\ 6\ 2\ 8\ 1\ 9_! = 0.0\ 1\ 0\ 1\ 4\ 6\ 2\ 8\ 1\ 8\ 11\ 12\ 13 ..._!
  • 9/11 = 0.1\ 1\ 3\ 3\ 1\ 0\ 5\ 0\ 8\ 2_!
  • 1/15 = 0.0\ 0\ 1\ 3_!
  • 237/360 = 0.1\ 0\ 3\ 4_!
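Such expansions can be computed exactly with rational arithmetic (a Python sketch; the function name is illustrative, and the input is assumed to satisfy 0 ≤ p/q < 1):

```python
from fractions import Fraction

def factoradic_fraction(p, q):
    """Digits of p/q after the point in factorial base, starting at the
    1/2! place (the always-zero 1/0! and 1/1! places are omitted).
    For a rational input the loop always terminates."""
    x = Fraction(p, q)
    digits = []
    n = 2
    while x:
        x *= n
        d = int(x)        # digit for the 1/n! place, at most n - 1
        digits.append(d)
        x -= d
        n += 1
    return digits

print(factoradic_fraction(1, 11))   # [0, 0, 2, 0, 5, 3, 1, 4, 0, 10]
```

As claimed above, the expansion of 1/11 has exactly eleven terms (ten once the 1/1! place is omitted), and its last digit 10 equals the prime denominator minus the numerator.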
A small number of constants also have patterned representations with this method:
  • e = 10.1\ 1\ 1\ 1\ 1\ 1\ 1\ 1\ 1\ 1\ 1\ 1\ 1\ 1\ 1..._!
  • e^{-1} = 0.0\ 2\ 0\ 4\ 0\ 6\ 0\ 8\ 0\ 10\ 0\ 12\ 0\ 14..._!
  • \sin(1) = 0.1\ 2\ 0\ 0\ 5\ 6\ 0\ 0\ 9\ 10\ 0\ 0\ 13\ 14..._!
  • \cos(1) = 0.1\ 0\ 0\ 4\ 5\ 0\ 0\ 8\ 9\ 0\ 0\ 12\ 13\ 0..._!
  • \sinh(1) = 1.0\ 1\ 0\ 1\ 0\ 1\ 0\ 1\ 0\ 1\ 0\ 1\ 0\ 1\ 0..._!
  • \cosh(1) = 1.1\ 0\ 1\ 0\ 1\ 0\ 1\ 0\ 1\ 0\ 1\ 0\ 1\ 0\ 1..._!

Decimal Notation

Decimal notation is the writing of numbers in a base-10 numeral system. Examples are Roman numerals, Brahmi numerals, and Chinese numerals, as well as the Hindu-Arabic numerals used by speakers of many European languages. Roman numerals have symbols for the decimal powers (1, 10, 100, 1000) and secondary symbols for half these values (5, 50, 500). Brahmi numerals have symbols for the nine numbers 1–9, the nine decades 10–90, plus a symbol for 100 and another for 1000. Chinese numerals have symbols for 1–9, and additional symbols for powers of 10, which in modern usage reach 10^44.
However, when people who use Hindu-Arabic numerals speak of decimal notation, they often mean not just decimal numeration, as above, but also decimal fractions, all conveyed as part of a positional system. Positional decimal systems include a zero and use symbols (called digits) for the ten values (0, 1, 2, 3, 4, 5, 6, 7, 8, and 9) to represent any number, no matter how large or how small. These digits are often used with a decimal separator which indicates the start of a fractional part, and with a symbol such as the plus sign + (for positive) or minus sign − (for negative) adjacent to the numeral to indicate whether it is greater or less than zero, respectively.
Positional notation uses positions for each power of ten: units, tens, hundreds, thousands, etc. The position of each digit within a number denotes the multiplier (power of ten) multiplied with that digit—each position has a value ten times that of the position to its right. There were at least two presumably independent sources of positional decimal systems in ancient civilization: the Chinese counting rod system and the Hindu-Arabic numeral system (the latter descended from Brahmi numerals).
Ten is the number which is the count of fingers and thumbs on both hands (or toes on the feet). The English word digit as well as its translation in many languages is also the anatomical term for fingers and toes. In English, decimal (decimus < Lat.) means tenth, decimate means reduce by a tenth, and denary (denarius < Lat.) means the unit of ten.
The symbols for the digits in common use around the globe today are called Arabic numerals by Europeans and Indian numerals by Arabs, the two groups' terms both referring to the culture from which they learned the system. However, the symbols used in different areas are not identical; for instance, Western Arabic numerals (from which the European numerals are derived) differ from the forms used by other Arab cultures.

Decimal fractions

A decimal fraction is a fraction whose denominator is a power of ten.
Decimal fractions are commonly expressed without a denominator, the decimal separator being inserted into the numerator (with leading zeros added if needed) at the position from the right corresponding to the power of ten of the denominator; e.g., 8/10, 83/100, 83/1000, and 8/10000 are expressed as 0.8, 0.83, 0.083, and 0.0008. In English-speaking countries, some Latin American countries, and many Asian countries, a period (.) or raised period (·) is used as the decimal separator; in many other countries, particularly in Europe, a comma is used.
The integer part, or integral part, of a decimal number is the part to the left of the decimal separator (see also truncation). The part from the decimal separator to the right is the fractional part. It is usual for a decimal number that consists only of a fractional part (mathematically, a proper fraction) to have a leading zero in its numeral. This disambiguates the decimal sign from other punctuation and, especially when a negative sign is present, helps the reader see the sign of the numeral as a whole.
Trailing zeros after the decimal point are not necessary, although in science, engineering and statistics they can be retained to indicate a required precision or to show a level of confidence in the accuracy of the number: Although 0.080 and 0.08 are numerically equal, in engineering 0.080 suggests a measurement with an error of up to one part in two thousand (±0.0005), while 0.08 suggests a measurement with an error of up to one in two hundred (see significant figures).

Other rational numbers

Any rational number with a denominator whose only prime factors are 2 and/or 5 may be precisely expressed as a decimal fraction and has a finite decimal expansion.
1/2 = 0.5
1/20 = 0.05
1/5 = 0.2
1/50 = 0.02
1/4 = 0.25
1/40 = 0.025
1/25 = 0.04
1/8 = 0.125
1/125 = 0.008
1/10 = 0.1
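The divisibility criterion can be sketched in Python (the helper name is illustrative):

```python
from math import gcd

def has_terminating_decimal(p, q):
    """True iff p/q, in lowest terms, has only 2s and 5s in its
    denominator, and hence a finite decimal expansion."""
    q //= gcd(p, q)           # reduce the fraction first
    for f in (2, 5):
        while q % f == 0:
            q //= f
    return q == 1
```

Note that the fraction must be reduced first: 3/6 terminates (it is 1/2) even though 6 has the prime factor 3.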
If the rational number's denominator has any prime factors other than 2 or 5, it cannot be expressed as a finite decimal fraction, and has a unique eventually repeating infinite decimal expansion.
1/3 = 0.333333… (with 3 repeating)
1/9 = 0.111111… (with 1 repeating)
100 − 1 = 99 = 9 × 11:
1/11 = 0.090909… (with 09 repeating)
1000 − 1 = 999 = 9 × 111 = 27 × 37:
1/27 = 0.037037037…
1/37 = 0.027027027…
1/111 = 0.009009009…
also:
1/81 = 0.012345679012… (with 012345679 repeating)
That a rational number must have a finite or recurring decimal expansion can be seen to be a consequence of the long division algorithm, in that there are only q-1 possible nonzero remainders on division by q, so that the recurring pattern will have a period less than q. For instance, to find 3/7 by long division:
     0.4 2 8 5 7 1 4 ...
7 ) 3.0 0 0 0 0 0 0
    2 8              30/7 = 4 r 2
      2 0
      1 4            20/7 = 2 r 6
        6 0
        5 6          60/7 = 8 r 4
          4 0
          3 5        40/7 = 5 r 5
            5 0
            4 9      50/7 = 7 r 1
              1 0
                7    10/7 = 1 r 3
              3 0
              2 8    30/7 = 4 r 2
                2 0
etc.
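The digit-by-digit long division can be sketched in Python (the helper name is illustrative):

```python
def decimal_digits(p, q, count):
    """First `count` digits of p/q (with 0 < p < q) after the decimal
    point, by long division: each step is digit, r = divmod(10 * r, q)."""
    digits, r = [], p
    for _ in range(count):
        d, r = divmod(10 * r, q)
        digits.append(d)
    return digits

print(decimal_digits(3, 7, 7))   # [4, 2, 8, 5, 7, 1, 4]
```

Because the remainder r can take at most q − 1 nonzero values, the sequence of (digit, remainder) pairs must eventually repeat, which is exactly the argument above.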
The converse to this observation is that every recurring decimal represents a rational number p/q. This is a consequence of the fact that the recurring part of a decimal representation is, in fact, an infinite geometric series which will sum to a rational number. For instance,
0.0123123123\cdots = \frac{123}{10000} \sum_{k=0}^\infty 0.001^k = \frac{123}{10000}\ \frac{1}{1-0.001} = \frac{123}{9990} = \frac{41}{3330}
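The geometric-series conversion can be sketched in Python (the helper name and string-argument convention are illustrative):

```python
from fractions import Fraction

def repeating_to_fraction(non_repeating, repeating):
    """Exact value of 0.<non_repeating><repeating><repeating>... The
    repeating block contributes int(repeating) / (10^k * (10^m - 1)),
    the closed form of the geometric series above."""
    k, m = len(non_repeating), len(repeating)
    head = Fraction(int(non_repeating or "0"), 10 ** k)
    tail = Fraction(int(repeating), 10 ** k * (10 ** m - 1))
    return head + tail

print(repeating_to_fraction("0", "123"))   # 41/3330
```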

Real numbers

Every real number has a (possibly infinite) decimal representation; i.e., it can be written as
 x = \mathop{\rm sign}(x) \sum_{i\in\mathbb Z} a_i\,10^i
where
  • sign() is the sign function, and
  • ai ∈ { 0,1,…,9 } for all i ∈ ℤ are its decimal digits, equal to zero for all i greater than some bound (roughly the common logarithm of |x|).
Such a sum converges even if there are infinitely many non-zero ai: the nonzero digits beyond the bound occur at negative indices i, where the terms are dominated by the convergent geometric series Σ 9·10^i.
Rational numbers (e.g., p/q) with prime factors in the denominator other than 2 and 5 (when reduced to simplest terms) have a unique recurring decimal representation.

Non-uniqueness of decimal representation

Consider those rational numbers which have only the factors 2 and 5 in the denominator, i.e., which can be written as p/(2a5b). In this case there is a terminating decimal representation. For instance, 1/1 = 1, 1/2 = 0.5, 3/5 = 0.6, 3/25 = 0.12 and 1306/1250 = 1.0448. Such numbers are the only real numbers which do not have a unique decimal representation, as they can also be written as a representation that has a recurring 9, for instance 1 = 0.99999…, 1/2 = 0.499999…, etc. The number 0 = 0/1 is special in that it has no representation with recurring 9.
This leaves the irrational numbers. They also have unique infinite decimal representations, and can be characterised as the numbers whose decimal representations neither terminate nor recur.
So in general the decimal representation is unique, if one excludes representations that end in a recurring 9.
The same trichotomy holds for other base-n positional numeral systems:
  • Terminating representation: rational where the denominator divides some n^k
  • Recurring representation: other rational
  • Non-terminating, non-recurring representation: irrational
A version of this even holds for irrational-base numeration systems, such as golden mean base representation.

Decimal computation

Decimal computation was carried out in ancient times in many ways, typically with rod calculus, on sand tables, or with a variety of abaci.
Modern computer hardware and software systems commonly use a binary representation internally (although many early computers, such as the ENIAC or the IBM 650, used decimal representation internally). For external use by computer specialists, this binary representation is sometimes presented in the related octal or hexadecimal systems.
For most purposes, however, binary values are converted to or from the equivalent decimal values for presentation to or input from humans; computer programs express literals in decimal by default. (123.1, for example, is written as such in a computer program, even though many computer languages are unable to encode that number precisely.)
Both computer hardware and software also use internal representations which are effectively decimal for storing decimal values and doing arithmetic. Often this arithmetic is done on data which are encoded using some variant of binary-coded decimal, especially in database implementations, but there are other decimal representations in use (such as in the new IEEE 754 Standard for Floating-Point Arithmetic).
Decimal arithmetic is used in computers so that decimal fractional results can be computed exactly, which is not possible using a binary fractional representation. This is often important for financial and other calculations.

History

Many ancient cultures calculated from early on with numerals based on ten. Egyptian hieroglyphs, in evidence since around 3000 BC, used a purely decimal system, as did the Cretan hieroglyphs (ca. 1625−1500 BC) of the Minoans, whose numerals are closely based on the Egyptian model. The decimal system was handed down to the consecutive Bronze Age cultures of Greece, including Linear A (ca. 18th century BC−1450 BC) and Linear B (ca. 1375−1200 BC); the number system of classical Greece also used powers of ten, including, like the Roman numerals, an intermediate base of 5. Notably, the polymath Archimedes (c. 287–212 BC) invented a decimal positional system in his Sand Reckoner which was based on 10^8; this later led the German mathematician Carl Friedrich Gauss to lament what heights science would already have reached in his day had Archimedes fully realized the potential of his ingenious discovery. The Hittite hieroglyphs (since the 15th century BC), just like the Egyptian and early Greek numerals, were strictly decimal.
The Egyptian hieratic numerals, the Greek alphabet numerals, the Roman numerals, the Chinese numerals and early Indian Brahmi numerals are all non-positional decimal systems, and required large numbers of symbols. For instance, Egyptian numerals used different symbols for 10, 20, through 90, 100, 200, through 900, 1000, 2000, 3000, 4000, to 10,000.

History of decimal fractions


[Figure: counting-rod representation of the decimal fraction 1/7]
According to Joseph Needham, decimal fractions were first developed and used by the Chinese in the 1st century BC, and then spread to the Middle East and from there to Europe. The written Chinese decimal fractions were non-positional; however, counting-rod fractions were positional.
Qin Jiushao in his book Mathematical Treatise in Nine Sections (1247) denoted 0.96644 by a row of counting-rod numerals for the digits 0, 9, 6, 6, 4, 4, meaning
096644

The Jewish mathematician Immanuel Bonfils invented decimal fractions around 1350, anticipating Simon Stevin, but did not develop any notation to represent them.
The Persian mathematician Jamshīd al-Kāshī claimed to have discovered decimal fractions himself in the 15th century, though J. Lennart Berggren notes that positional decimal fractions were used five centuries before him by Arab mathematician Abu'l-Hasan al-Uqlidisi as early as the 10th century.
Khwarizmi introduced fractions to Islamic countries in the early 9th century. This form of fraction, with the numerator on top and the denominator on the bottom but without a horizontal bar, was also used in the 10th century by Abu'l-Hasan al-Uqlidisi and again in the 15th century in the work "Arithmetic Key" by Jamshīd al-Kāshī.
A forerunner of modern European decimal notation was introduced by Simon Stevin in the 16th century.

Natural languages

The Telugu language uses a straightforward decimal system. Other Dravidian languages, such as Tamil and Malayalam, replaced the word for nine, tondu, with onpattu ("one to ten") during the early Middle Ages, while Telugu preserved the number nine as tommidi.
The Hungarian language also uses a straightforward decimal system. Unlike many Indo-European languages, there are no exceptions between 10 and 20 (e.g., 11 is expressed as "tízenegy", literally "one on ten") or between 20 and 100 (23 is "huszonhárom" = "three on twenty"), and the ones never precede the tens (as they do in German, e.g., "dreiundzwanzig" = 23).
A straightforward decimal rank system with a word for each order (10 十, 100 百, 1000 千, 10000 万), in which 11 is expressed as ten-one, 23 as two-ten-three, and 89345 as 8万 9千 3百 4十 5 (eight ten-thousands, nine thousands, three hundreds, four tens, five), is found in the Chinese languages, and in Vietnamese with a few irregularities. Japanese, Korean, and Thai have imported the Chinese decimal system. Many other languages with a decimal system have special words for the numbers between 10 and 20 and for the decades; for example, in English 11 is "eleven", not "ten-one".
Incan languages such as Quechua and Aymara have an almost straightforward decimal system, in which 11 is expressed as ten with one and 23 as two-ten with three.
Some psychologists suggest irregularities of the English names of numerals may hinder children's counting ability.

 
