Christodoulakis SimpleAlgorithmforSo rtingtheFibon

50 %
50 %
Information about Christodoulakis SimpleAlgorithmforSo rtingtheFibon

Published on January 16, 2008

Author: Tommaso


Simple Algorithm for Sorting the Fibonacci String Rotations :  Simple Algorithm for Sorting the Fibonacci String Rotations Manolis Christodoulakis King’s College London Joint work with Costas S. Iliopoulos Yoan José Pinzón Ardila Our Goal:  Our Goal What makes Fibonacci strings a best case input for the Burrows-Wheeler Transform (BWT)? Relationship between different rotations of a Fibonacci string What is their lexicographic order? Side effect: we can deduce the symbol stored at any position of any Fibonacci string in constant time (without using , provided that the fn values are known) Fibonacci Strings & Numbers:  Fibonacci Strings & Numbers The n-th Fibonacci string Fn = Fn-1Fn-2 n≥2 F0=b, F1=a The n-th Fibonacci number fn = fn-1+fn-2 n≥2 f0=1, f1=1 F2 a = b F3 a = b a F4 a = b a a b F1 a = F0 b = f0 1 = f1 1 = f2 2 = f3 3 = f4 5 = Notation:  Notation The i-th rotation of a string where i is taken modulo n. rank(i,x) = the rank of Ri(x) rot(ρ,x) = the rotation whose rank is ρ 0 1 … i-1 i … n-1 x = Ri(x) = Burrows-Wheeler Transform (BWT):  Burrows-Wheeler Transform (BWT) M.Burrows and D.J.Wheeler. 1994 Purpose: to make a string more compressible BWT Algorithm: Create list of all rotations Sort them Output last symbol of every rotation Output the rank of the 0-th rotation BWT on Fibonacci Strings:  BWT on Fibonacci Strings F5 = abaababa, f5 = 8 R0(F5) a = b a a b a b a R1(F5) b = a a b a b a a R2(F5) a = a b a b a a b R3(F5) a = b a b a a b a R4(F5) b = a b a a b a a R5(F5) a = b a a b a a b R6(F5) b = a a b a a b a R7(F5) a = a b a a b a b R0(F5) a = b a a b a b a R1(F5) b = a a b a b a a R2(F5) a = a b a b a a b R3(F5) a = b a b a a b a R4(F5) b = a b a a b a a R5(F5) a = b a a b a a b R6(F5) b = a a b a a b a R7(F5) a = a b a a b a b Properties of Fibonacci Strings:  Properties of Fibonacci Strings The number of ‘b’ in Fn is fn-2 Proof: By induction. C.S.Iliopoulos, D.W.Moore and W.F.Smyth. 1997 Fn = Fn-2Fn-3…F1un, un = ba (n odd) un = ab (n even) Let’s call this the IMS formula. Similarities in Rotations:  Similarities in Rotations R0(Fn) differs from Rfn-2(Fn) in 2 symbols Proof: R0(Fn) = Fn-2Fn-3…F1un Rfn-2(Fn) = Fn-3…F1unFn-2 (1) R0(Fn) = Fn-1Fn-2 = Fn-3…F1un-1Fn-2 (2) Ri(Fn) differs from Ri+fn-2(Fn) in 2 symbols Proof: Ri(Fn) = Ri(R0(Fn)) Ri+fn-2(Fn) = Ri(Rfn-2(Fn)) Relative Order of Rotations:  Relative Order of Rotations Ri(Fn) < Ri+fn-2(Fn) for n odd, i  fn-1-1 Proof: R0(Fn) = Fn-3…F1un-1Fn-2 Rfn-2(Fn) = Fn-3…F1un Fn-2 For i=fn-1-1: Ri(Fn) = bFn-2Fn-3…F1a Ri+fn-2(Fn)= aFn-2Fn-3…F1b Similarly, Ri(Fn) > Ri+fn-2(Fn) for n even, i  fn-1-1 = Fn-3 … F1 ab Fn-2 = Fn-3 … F1 ba Fn-2 Sorted List of Rotations:  Sorted List of Rotations We proved (n odd): Ri(Fn) < Ri+fn-2(Fn) i  fn-1-1 (3) We will now prove that there is no j s.t. Ri(Fn) < Rj(Fn) < Ri+fn-2(Fn) Proof: (constructive) Start at i=fn-1 and construct the partial list Ri Ri+fn-2 Ri+2fn-2 Ri+3fn-2 … Ri+kfn-2 … for as long as i+kfn-2  fn-1-1 (mod fn)  kfn-1 I.e. the list is complete! Identify Rotation (i) by Rank (ρ):  Identify Rotation (i) by Rank (ρ) Therefore, for n odd: rot(ρ,Fn) = fn-1 = (ρfn-2-1) mod fn Similarly, for n even, the sorted list is constructed bottom-up giving rot(ρ,Fn) = (-(ρ+1)fn-2-1) mod fn +ρfn-2) mod fn ( Identify Rank (ρ) of a Rotation (i):  Identify Rank (ρ) of a Rotation (i) This is simply the inverse of the previous function n odd rank(i,Fn) = ((i+1)fn-2) mod fn n even rank(i,Fn) = ((i+1)fn-2-1) mod fn Symbols of Fibonacci Strings:  Symbols of Fibonacci Strings Fn[i] = ? Observe that Fn[i] = Ri(Fn)[0] In the sorted list of rotations, the first fn-1 rotations start with ‘a’, the rest with ‘b’ Thus Fn[i] can be deduced from rank(i,Fn) If rank(i,Fn) ≤ fn-1 then Fn[i]=a else b. BWT & Fibonacci ― The Quick Way:  BWT & Fibonacci ― The Quick Way The first fn-2 symbols of BWT are ‘b’ Proof: (n odd) We proved the first fn-2 rotations have index (ρ·fn-2-1)modfn for 0 ≤ ρ < fn-2 The last symbol of these rotations is Fn[ (ρ·fn-2-1 )modfn ] Which for 0 ≤ ρ < fn-2 is ‘b’ The next fn-1 symbols of BWT are ‘a’ Proof: Consequence of previous lemma +fn-1

Add a comment

Related presentations