Arnoux-Rauzy-Poincaré sequences
26 février 2015 | Catégories: sage | View CommentsIn a recent article with Valérie Berthé [BL15], we provided a multidimensional continued fraction algorithm called Arnoux-Rauzy-Poincaré (ARP) to construct, given any vector \(v\in\mathbb{R}_+^3\), an infinite word \(w\in\{1,2,3\}^\mathbb{N}\) over a three-letter alphabet such that the frequencies of letters in \(w\) exists and are equal to \(v\) and such that the number of factors (i.e. finite block of consecutive letters) of length \(n\) appearing in \(w\) is linear and less than \(\frac{5}{2}n+1\). We also conjecture that for almost all \(v\) the contructed word describes a discrete path in the positive octant staying at a bounded distance from the euclidean line of direction \(v\).
In Sage, you can construct this word using the next version of my package slabbe-0.2 (not released yet, email me to press me to finish it). The one with frequencies of letters proportionnal to \((1, e, \pi)\) is:
sage: from slabbe.mcf import algo sage: D = algo.arp.substitutions() sage: it = algo.arp.coding_iterator((1,e,pi)) sage: w = words.s_adic(it, repeat(1), D) word: 1232323123233231232332312323123232312323...
The factor complexity is close to 2n+1 and the balance is often less or equal to three:
sage: w[:10000].number_of_factors(100) 202 sage: w[:100000].number_of_factors(1000) 2002 sage: w[:1000].balance() 3 sage: w[:2000].balance() 3
Note that bounded distance from the euclidean line almost surely was proven in [DHS2013] for Brun algorithm, another MCF algorithm.
Other approaches: Standard model and billiard sequences
Other approaches have been proposed to construct such discrete lines.
One of them is the standard model of Eric Andres [A03]. It is also equivalent to billiard sequences in the cube. It is well known that the factor complexity of billiard sequences is quadratic \(p(n)=n^2+n+1\) [AMST94]. Experimentally, we can verify this. We first create a billiard word of some given direction:
sage: from slabbe import BilliardCube sage: v = vector(RR, (1, e, pi)) sage: b = BilliardCube(v) sage: b Cubic billiard of direction (1.00000000000000, 2.71828182845905, 3.14159265358979) sage: w = b.to_word() sage: w word: 3231232323123233213232321323231233232132...
We create some prefixes of \(w\) that we represent internally as char*. The creation is slow because the implementation of billiard words in my optional package is in Python and is not that efficient:
sage: p3 = Word(w[:10^3], alphabet=[1,2,3], datatype='char') sage: p4 = Word(w[:10^4], alphabet=[1,2,3], datatype='char') # takes 3s sage: p5 = Word(w[:10^5], alphabet=[1,2,3], datatype='char') # takes 32s sage: p6 = Word(w[:10^6], alphabet=[1,2,3], datatype='char') # takes 5min 20s
We see below that exactly \(n^2+n+1\) factors of length \(n<20\) appears in the prefix of length 1000000 of \(w\):
sage: A = ['n'] + range(30) sage: c3 = ['p_(w[:10^3])(n)'] + map(p3.number_of_factors, range(30)) sage: c4 = ['p_(w[:10^4])(n)'] + map(p4.number_of_factors, range(30)) sage: c5 = ['p_(w[:10^5])(n)'] + map(p5.number_of_factors, range(30)) # takes 4s sage: c6 = ['p_(w[:10^6])(n)'] + map(p6.number_of_factors, range(30)) # takes 49s sage: ref = ['n^2+n+1'] + [n^2+n+1 for n in range(30)] sage: T = table(columns=[A,c3,c4,c5,c6,ref]) sage: T n p_(w[:10^3])(n) p_(w[:10^4])(n) p_(w[:10^5])(n) p_(w[:10^6])(n) n^2+n+1 +----+-----------------+-----------------+-----------------+-----------------+---------+ 0 1 1 1 1 1 1 3 3 3 3 3 2 7 7 7 7 7 3 13 13 13 13 13 4 21 21 21 21 21 5 31 31 31 31 31 6 43 43 43 43 43 7 52 55 56 57 57 8 63 69 71 73 73 9 74 85 88 91 91 10 87 103 107 111 111 11 100 123 128 133 133 12 115 145 151 157 157 13 130 169 176 183 183 14 144 195 203 211 211 15 160 223 232 241 241 16 176 253 263 273 273 17 192 285 296 307 307 18 208 319 331 343 343 19 224 355 368 381 381 20 239 392 407 421 421 21 254 430 448 463 463 22 268 470 491 507 507 23 282 510 536 553 553 24 296 552 583 601 601 25 310 596 632 651 651 26 324 642 683 703 703 27 335 687 734 757 757 28 345 734 787 813 813 29 355 783 842 871 871
Billiard sequences generate paths that are at a bounded distance from an euclidean line. This is equivalent to say that the balance is finite. The balance is defined as the supremum value of difference of the number of apparition of a letter in two factors of the same length. For billiard sequences, the balance is 2:
sage: p3.balance() 2 sage: p4.balance() # takes 2min 37s 2
Other approaches: Melançon and Reutenauer
Melançon and Reutenauer [MR13] also suggested a method that generalizes Christoffel words in higher dimension. The construction is based on the application of two substitutions generalizing the construction of sturmian sequences. Below we compute the factor complexity and the balance of some of their words over a three-letter alphabet.
On a three-letter alphabet, the two morphisms are:
sage: L = WordMorphism('1->1,2->13,3->2') sage: R = WordMorphism('1->13,2->2,3->3') sage: L WordMorphism: 1->1, 2->13, 3->2 sage: R WordMorphism: 1->13, 2->2, 3->3
Example 1: periodic case \(LRLRLRLRLR\dots\). In this example, the factor complexity seems to be around \(p(n)=2.76n\) and the balance is at least 28:
sage: from itertools import repeat, cycle sage: W = words.s_adic(cycle((L,R)),repeat('1')) sage: W word: 1213122121313121312212212131221213131213... sage: map(W[:10000].number_of_factors, [10,20,40,80]) [27, 54, 110, 221] sage: [27/10., 54/20., 110/40., 221/80.] [2.70000000000000, 2.70000000000000, 2.75000000000000, 2.76250000000000] sage: W[:1000].balance() # takes 1.6s 21 sage: W[:2000].balance() # takes 6.4s 28
Example 2: \(RLR^2LR^4LR^8LR^{16}LR^{32}LR^{64}LR^{128}\dots\) taken from the conclusion of their article. In this example, the factor complexity seems to be \(p(n)=3n\) and balance at least as high (=bad) as \(122\):
sage: W = words.s_adic([R,L,R,R,L,R,R,R,R,L]+[R]*8+[L]+[R]*16+[L]+[R]*32+[L]+[R]*64+[L]+[R]*128,'1') sage: W.length() 330312 sage: map(W.number_of_factors, [10, 20, 100, 200, 300, 1000]) [29, 57, 295, 595, 895, 2981] sage: [29/10., 57/20., 295/100., 595/200., 895/300., 2981/1000.] [2.90000000000000, 2.85000000000000, 2.95000000000000, 2.97500000000000, 2.98333333333333, 2.98100000000000] sage: W[:1000].balance() # takes 1.6s 122 sage: W[:2000].balance() # takes 6s 122
Example 3: some random ones. The complexity \(p(n)/n\) occillates between 2 and 3 for factors of length \(n=1000\) in prefixes of length 100000:
sage: for _ in range(10): ....: W = words.s_adic([choice((L,R)) for _ in range(50)],'1') ....: print W[:100000].number_of_factors(1000)/1000. 2.02700000000000 2.23600000000000 2.74000000000000 2.21500000000000 2.78700000000000 2.52700000000000 2.85700000000000 2.33300000000000 2.65500000000000 2.51800000000000
For ten randomly generated words, the balance goes from 6 to 27 which is much more than what is obtained for billiard words or by our approach:
sage: for _ in range(10): ....: W = words.s_adic([choice((L,R)) for _ in range(50)],'1') ....: print W[:1000].balance(), W[:2000].balance() 12 15 8 24 14 14 5 11 17 17 14 14 6 6 19 27 9 16 12 12
References
[BL15] | V. Berthé, S. Labbé, Factor Complexity of S-adic words generated by the Arnoux-Rauzy-Poincaré Algorithm, Advances in Applied Mathematics 63 (2015) 90-130. http://dx.doi.org/10.1016/j.aam.2014.11.001 |
[DHS2013] | Delecroix, Vincent, Tomás Hejda, and Wolfgang Steiner. “Balancedness of Arnoux-Rauzy and Brun Words.” In Combinatorics on Words, 119–31. Springer, 2013. http://link.springer.com/chapter/10.1007/978-3-642-40579-2_14. |
[A03] | E. Andres, Discrete linear objects in dimension n: the standard model, Graphical Models 65 (2003) 92-111. |
[AMST94] | P. Arnoux, C. Mauduit, I. Shiokawa, J. I. Tamura, Complexity of sequences defined by billiards in the cube, Bull. Soc. Math. France 122 (1994) 1-12. |
[MR13] | G. Melançon, C. Reutenauer, On a class of Lyndon words extending Christoffel words and related to a multidimensional continued fraction algorithm. J. Integer Seq. 16, No. 9, Article 13.9.7, 30 p., electronic only (2013). https://cs.uwaterloo.ca/journals/JIS/VOL16/Reutenauer/reut3.html |