Sequences of bounded integers#
This module provides BoundedIntegerSequence
, which implements
sequences of bounded integers and is for many (but not all) operations faster
than representing the same sequence as a Python tuple
.
The underlying data structure is similar to Bitset
,
which means that certain operations are implemented by using fast shift
operations from MPIR. The following boilerplate functions can be
cimported in Cython modules:
cdef bint biseq_init(biseq_t R, mp_size_t l, mp_size_t itemsize) except -1
Allocate memory for a bounded integer sequence of length
l
with items fitting initemsize
bits.cdef inline void biseq_dealloc(biseq_t S)
Deallocate the memory used by
S
.cdef bint biseq_init_copy(biseq_t R, biseq_t S)
Initialize
R
as a copy ofS
.cdef tuple biseq_pickle(biseq_t S)
Return a triple
(bitset_data, itembitsize, length)
definingS
.cdef bint biseq_unpickle(biseq_t R, tuple bitset_data, mp_bitcnt_t itembitsize, mp_size_t length) except -1
Initialise
R
from data returned bybiseq_pickle
.cdef bint biseq_init_list(biseq_t R, list data, size_t bound) except -1
Convert a list to a bounded integer sequence, which must not be allocated.
cdef inline Py_hash_t biseq_hash(biseq_t S)
Hash value for
S
.cdef inline bint biseq_richcmp(biseq_t S1, biseq_t S2, int op)
Comparison of
S1
andS2
. This takes into account the bound, the length, and the list of items of the two sequences.cdef bint biseq_init_concat(biseq_t R, biseq_t S1, biseq_t S2) except -1
Concatenate
S1
andS2
and write the result toR
. Does not test whether the sequences have the same bound!cdef inline bint biseq_startswith(biseq_t S1, biseq_t S2)
Is
S1=S2+something
? Does not check whether the sequences have the same bound!cdef mp_size_t biseq_contains(biseq_t S1, biseq_t S2, mp_size_t start) except -2
Return the position in
S1
ofS2
as a subsequence ofS1[start:]
, or-1
ifS2
is not a subsequence. Does not check whether the sequences have the same bound!cdef mp_size_t biseq_starswith_tail(biseq_t S1, biseq_t S2, mp_size_t start) except -2:
Return the smallest number
i
such that the bounded integer sequenceS1
starts with the sequenceS2[i:]
, wherestart <= i < S1.length
, or return-1
if no suchi
exists.cdef mp_size_t biseq_index(biseq_t S, size_t item, mp_size_t start) except -2
Return the position in
S
of the item inS[start:]
, or-1
ifS[start:]
does not contain the item.cdef size_t biseq_getitem(biseq_t S, mp_size_t index)
Return
S[index]
, without checking margins.cdef size_t biseq_getitem_py(biseq_t S, mp_size_t index)
Return
S[index]
as Pythonint
, without checking margins.cdef biseq_inititem(biseq_t S, mp_size_t index, size_t item)
Set
S[index] = item
, without checking margins and assuming thatS[index]
has previously been zero.cdef inline void biseq_clearitem(biseq_t S, mp_size_t index)
Set
S[index] = 0
, without checking margins.cdef bint biseq_init_slice(biseq_t R, biseq_t S, mp_size_t start, mp_size_t stop, mp_size_t step) except -1
Initialise
R
withS[start:stop:step]
.
AUTHORS:
Simon King, Jeroen Demeyer (2014-10): initial version (github issue #15820)
- class sage.data_structures.bounded_integer_sequences.BoundedIntegerSequence#
Bases:
object
A sequence of non-negative uniformly bounded integers.
INPUT:
bound
– non-negative integer. When zero, aValueError
will be raised. Otherwise, the given bound is replaced by the power of two that is at least the given bound.data
– a list of integers.
EXAMPLES:
We showcase the similarities and differences between bounded integer sequences and lists respectively tuples.
To distinguish from tuples or lists, we use pointed brackets for the string representation of bounded integer sequences:
sage: from sage.data_structures.bounded_integer_sequences import BoundedIntegerSequence sage: S = BoundedIntegerSequence(21, [2, 7, 20]); S <2, 7, 20>
Each bounded integer sequence has a bound that is a power of two, such that all its item are less than this bound:
sage: S.bound() 32 sage: BoundedIntegerSequence(16, [2, 7, 20]) Traceback (most recent call last): ... OverflowError: list item 20 larger than 15
Bounded integer sequences are iterable, and we see that we can recover the originally given list:
sage: L = [randint(0,31) for i in range(5000)] sage: S = BoundedIntegerSequence(32, L) sage: list(L) == L True
Getting items and slicing works in the same way as for lists:
sage: n = randint(0,4999) sage: S[n] == L[n] True sage: m = randint(0,1000) sage: n = randint(3000,4500) sage: s = randint(1, 7) sage: list(S[m:n:s]) == L[m:n:s] True sage: list(S[n:m:-s]) == L[n:m:-s] True
The
index()
method works different for bounded integer sequences and tuples or lists. If one asks for the index of an item, the behaviour is the same. But we can also ask for the index of a sub-sequence:sage: L.index(L[200]) == S.index(L[200]) True sage: S.index(S[100:2000]) # random 100
Similarly, containment tests work for both items and sub-sequences:
sage: S[200] in S True sage: S[200:400] in S True sage: S[200]+S.bound() in S False
Bounded integer sequences are immutable, and thus copies are identical. This is the same for tuples, but of course not for lists:
sage: T = tuple(S) sage: copy(T) is T True sage: copy(S) is S True sage: copy(L) is L False
Concatenation works in the same way for lists, tuples and bounded integer sequences:
sage: M = [randint(0,31) for i in range(5000)] sage: T = BoundedIntegerSequence(32, M) sage: list(S+T)==L+M True sage: list(T+S)==M+L True sage: (T+S == S+T) == (M+L == L+M) True
However, comparison works different for lists and bounded integer sequences. Bounded integer sequences are first compared by bound, then by length, and eventually by reverse lexicographical ordering:
sage: S = BoundedIntegerSequence(21, [4,1,6,2,7,20,9]) sage: T = BoundedIntegerSequence(51, [4,1,6,2,7,20]) sage: S < T # compare by bound, not length True sage: T < S False sage: S.bound() < T.bound() True sage: len(S) > len(T) True
sage: T = BoundedIntegerSequence(21, [0,0,0,0,0,0,0,0]) sage: S < T # compare by length, not lexicographically True sage: T < S False sage: list(T) < list(S) True sage: len(T) > len(S) True
sage: T = BoundedIntegerSequence(21, [4,1,5,2,8,20,9]) sage: T > S # compare by reverse lexicographic ordering... True sage: S > T False sage: len(S) == len(T) True sage: list(S) > list(T) # direct lexicographic ordering is different True
- bound()#
Return the bound of this bounded integer sequence.
All items of this sequence are non-negative integers less than the returned bound. The bound is a power of two.
EXAMPLES:
sage: from sage.data_structures.bounded_integer_sequences import BoundedIntegerSequence sage: S = BoundedIntegerSequence(21, [4,1,6,2,7,20,9]) sage: T = BoundedIntegerSequence(51, [4,1,6,2,7,20,9]) sage: S.bound() 32 sage: T.bound() 64
- index(other)#
The index of a given item or sub-sequence of
self
EXAMPLES:
sage: from sage.data_structures.bounded_integer_sequences import BoundedIntegerSequence sage: S = BoundedIntegerSequence(21, [4,1,6,2,6,20,9,0]) sage: S.index(6) 2 sage: S.index(5) Traceback (most recent call last): ... ValueError: 5 is not in sequence sage: S.index(BoundedIntegerSequence(21, [6, 2, 6])) 2 sage: S.index(BoundedIntegerSequence(21, [6, 2, 7])) Traceback (most recent call last): ... ValueError: not a sub-sequence
The bound of (sub-)sequences matters:
sage: S.index(BoundedIntegerSequence(51, [6, 2, 6])) Traceback (most recent call last): ... ValueError: not a sub-sequence sage: S.index(0) 7 sage: S.index(S.bound()) Traceback (most recent call last): ... ValueError: 32 is not in sequence
- list()#
Converts this bounded integer sequence to a list
NOTE:
A conversion to a list is also possible by iterating over the sequence.
EXAMPLES:
sage: from sage.data_structures.bounded_integer_sequences import BoundedIntegerSequence sage: L = [randint(0,26) for i in range(5000)] sage: S = BoundedIntegerSequence(32, L) sage: S.list() == list(S) == L True
The discussion at github issue #15820 explains why the following is a good test:
sage: (BoundedIntegerSequence(21, [0,0]) + BoundedIntegerSequence(21, [0,0])).list() [0, 0, 0, 0]
- maximal_overlap(other)#
Return
self
’s maximal trailing sub-sequence thatother
starts with.Return
None
if there is no overlap.EXAMPLES:
sage: from sage.data_structures.bounded_integer_sequences import BoundedIntegerSequence sage: X = BoundedIntegerSequence(21, [4,1,6,2,7,2,3]) sage: S = BoundedIntegerSequence(21, [0,0,0,0,0,0,0]) sage: T = BoundedIntegerSequence(21, [2,7,2,3,0,0,0,0,0,0,0,1]) sage: (X+S).maximal_overlap(T) <2, 7, 2, 3, 0, 0, 0, 0, 0, 0, 0> sage: print((X+S).maximal_overlap(BoundedIntegerSequence(21, [2,7,2,3,0,0,0,0,0,1]))) None sage: (X+S).maximal_overlap(BoundedIntegerSequence(21, [0,0])) <0, 0> sage: B1 = BoundedIntegerSequence(4,[1,2,3,2,3,2,3]) sage: B2 = BoundedIntegerSequence(4,[2,3,2,3,2,3,1]) sage: B1.maximal_overlap(B2) <2, 3, 2, 3, 2, 3>
- startswith(other)#
Tells whether
self
starts with a given bounded integer sequenceEXAMPLES:
sage: from sage.data_structures.bounded_integer_sequences import BoundedIntegerSequence sage: L = [randint(0,26) for i in range(5000)] sage: S = BoundedIntegerSequence(27, L) sage: L0 = L[:1000] sage: T = BoundedIntegerSequence(27, L0) sage: S.startswith(T) True sage: L0[-1] = (L0[-1] + 1) % 27 sage: T = BoundedIntegerSequence(27, L0) sage: S.startswith(T) False sage: L0[-1] = (L0[-1] - 1) % 27 sage: L0[0] = (L0[0] + 1) % 27 sage: T = BoundedIntegerSequence(27, L0) sage: S.startswith(T) False sage: L0[0] = (L0[0] - 1) % 27
The bounds of the sequences must be compatible, or
startswith()
returnsFalse
:sage: T = BoundedIntegerSequence(51, L0) sage: S.startswith(T) False
- sage.data_structures.bounded_integer_sequences.NewBISEQ(bitset_data, itembitsize, length)#
Helper function for unpickling of
BoundedIntegerSequence
.EXAMPLES:
sage: from sage.data_structures.bounded_integer_sequences import BoundedIntegerSequence sage: L = [randint(0,26) for i in range(5000)] sage: S = BoundedIntegerSequence(32, L) sage: loads(dumps(S)) == S # indirect doctest True