Bitsets#

A Python interface to the fast bitsets in Sage. Bitsets are fast binary sets that store elements by toggling bits in an array of numbers. A bitset can store values between \(0\) and capacity - 1, inclusive (where capacity is finite, but arbitrary). The storage cost is linear in capacity.

Warning

This class is most likely to be useful as a way to store Cython bitsets in Python data structures, acting on them using the Cython inline functions. If you want to use these classes for a Python set type, the Python set or frozenset data types may be faster.

class sage.data_structures.bitset.Bitset#

Bases: FrozenBitset

A bitset class which leverages inline Cython functions for creating and manipulating bitsets. See the class documentation of FrozenBitset for details on the parameters of the constructor and how to interpret the string representation of a Bitset.

A bitset can be thought of in two ways. First, as a set of elements from the universe of the \(n\) natural numbers \(0, 1, \dots, n-1\) (where the capacity \(n\) can be specified), with typical set operations such as intersection, union, symmetric difference, etc. Secondly, a bitset can be thought of as a binary vector with typical binary operations such as and, or, xor, etc. This class supports both interfaces.

The interface in this class mirrors the interface in the set data type of Python.

Warning

This class is most likely to be useful as a way to store Cython bitsets in Python data structures, acting on them using the Cython inline functions. If you want to use this class for a Python set type, the Python set data type may be faster.

See also

EXAMPLES:

sage: a = Bitset('1101')
sage: loads(dumps(a)) == a
True
sage: a = Bitset('1101' * 32)
sage: loads(dumps(a)) == a
True
add(n)#

Update the bitset by adding n.

EXAMPLES:

sage: a = Bitset('110')
sage: a.add(5)
sage: a
110001
sage: a.add(100)
sage: sorted(list(a))
[0, 1, 5, 100]
sage: a.capacity()
101
clear()#

Removes all elements from the bitset.

EXAMPLES:

sage: a = Bitset('011')
sage: a.clear()
sage: a
000
sage: a = Bitset('011' * 32)
sage: a.clear()
sage: set(a)
set()
difference_update(other)#

Update the bitset to the difference of self and other.

EXAMPLES:

sage: a = Bitset('110')
sage: a.difference_update(Bitset('0101'))
sage: a
1000
sage: a_set = set(a)
sage: a.difference_update(FrozenBitset('010101' * 10)); a
100000000000000000000000000000000000000000000000000000000000
sage: a_set.difference_update(FrozenBitset('010101' * 10))
sage: a_set == set(a)
True
sage: a.difference_update(FrozenBitset('110'))
sage: a_set.difference_update(FrozenBitset('110'))
sage: a_set == set(a)
True
sage: a.difference_update(FrozenBitset('01010' * 20)); a
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
sage: a_set.difference_update(FrozenBitset('01010' * 20))
sage: a_set == set(a)
True
sage: b = Bitset('10101' * 20)
sage: b_set = set(b)
sage: b.difference_update(FrozenBitset('1' * 5)); b
0000010101101011010110101101011010110101101011010110101101011010110101101011010110101101011010110101
sage: b_set.difference_update(FrozenBitset('1' * 5))
sage: b_set == set(b)
True
discard(n)#

Update the bitset by removing n.

EXAMPLES:

sage: a = Bitset('110')
sage: a.discard(1)
sage: a
100
sage: a.discard(2)
sage: a.discard(4)
sage: a
100
sage: a = Bitset('000001' * 15); sorted(list(a))
[5, 11, 17, 23, 29, 35, 41, 47, 53, 59, 65, 71, 77, 83, 89]
sage: a.discard(83); sorted(list(a))
[5, 11, 17, 23, 29, 35, 41, 47, 53, 59, 65, 71, 77, 89]
sage: a.discard(82); sorted(list(a))
[5, 11, 17, 23, 29, 35, 41, 47, 53, 59, 65, 71, 77, 89]
intersection_update(other)#

Update the bitset to the intersection of self and other.

EXAMPLES:

sage: a = Bitset('110')
sage: a.intersection_update(Bitset('0101'))
sage: a
0100
sage: a_set = set(a)
sage: a.intersection_update(Bitset('0110' * 25))
sage: a
0100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
sage: a_set.intersection_update(Bitset('0110' * 25))
sage: set(a) == a_set
True
pop()#

Remove and return an arbitrary element from the set. Raises KeyError if the set is empty.

EXAMPLES:

sage: a = Bitset('011')
sage: a.pop()
1
sage: a
001
sage: a.pop()
2
sage: a
000
sage: a.pop()
Traceback (most recent call last):
...
KeyError: 'pop from an empty set'
sage: a = Bitset('0001'*32)
sage: a.pop()
3
sage: [a.pop() for _ in range(20)]
[7, 11, 15, 19, 23, 27, 31, 35, 39, 43, 47, 51, 55, 59, 63, 67, 71, 75, 79, 83]
remove(n)#

Update the bitset by removing n. Raises KeyError if n is not contained in the bitset.

EXAMPLES:

sage: a = Bitset('110')
sage: a.remove(1)
sage: a
100
sage: a.remove(2)
Traceback (most recent call last):
...
KeyError: 2
sage: a.remove(4)
Traceback (most recent call last):
...
KeyError: 4
sage: a
100
sage: a = Bitset('000001' * 15); sorted(list(a))
[5, 11, 17, 23, 29, 35, 41, 47, 53, 59, 65, 71, 77, 83, 89]
sage: a.remove(83); sorted(list(a))
[5, 11, 17, 23, 29, 35, 41, 47, 53, 59, 65, 71, 77, 89]
symmetric_difference_update(other)#

Update the bitset to the symmetric difference of self and other.

EXAMPLES:

sage: a = Bitset('110')
sage: a.symmetric_difference_update(Bitset('0101'))
sage: a
1001
sage: a_set = set(a)
sage: a.symmetric_difference_update(FrozenBitset('010101' * 10)); a
110001010101010101010101010101010101010101010101010101010101
sage: a_set.symmetric_difference_update(FrozenBitset('010101' * 10))
sage: a_set == set(a)
True
sage: a.symmetric_difference_update(FrozenBitset('01010' * 20)); a
1001011111000001111100000111110000011111000001111100000111110101001010010100101001010010100101001010
sage: a_set.symmetric_difference_update(FrozenBitset('01010' * 20))
sage: a_set == set(a)
True
sage: b = Bitset('10101' * 20)
sage: b_set = set(b)
sage: b.symmetric_difference_update( FrozenBitset('1' * 5)); b
0101010101101011010110101101011010110101101011010110101101011010110101101011010110101101011010110101
sage: b_set.symmetric_difference_update( FrozenBitset('1' * 5))
sage: b_set == set(b)
True
update(other)#

Update the bitset to include items in other.

EXAMPLES:

sage: a = Bitset('110')
sage: a.update(Bitset('0101'))
sage: a
1101
sage: a_set = set(a)
sage: a.update(Bitset('00011' * 25))
sage: a
11011000110001100011000110001100011000110001100011000110001100011000110001100011000110001100011000110001100011000110001100011
sage: a_set.update(Bitset('00011' * 25))
sage: set(a) == a_set
True
class sage.data_structures.bitset.FrozenBitset#

Bases: object

A frozen bitset class which leverages inline Cython functions for creating and manipulating bitsets.

A bitset can be thought of in two ways. First, as a set of elements from the universe of the \(n\) natural numbers \(0, 1, \dots, n-1\) (where the capacity \(n\) can be specified), with typical set operations such as intersection, union, symmetric difference, etc. Secondly, a bitset can be thought of as a binary vector with typical binary operations such as and, or, xor, etc. This class supports both interfaces.

The interface in this class mirrors the interface in the frozenset data type of Python. See the Python documentation on set types for more details on Python’s set and frozenset classes.

Warning

This class is most likely to be useful as a way to store Cython bitsets in Python data structures, acting on them using the Cython inline functions. If you want to use this class for a Python set type, the Python frozenset data type may be faster.

INPUT:

  • iter – initialization parameter (default: None). Valid input are:

    • Bitset and FrozenBitset – If this is a Bitset or FrozenBitset, then it is copied.

    • None – If None, then the bitset is set to the empty set.

    • string – If a nonempty string, then the bitset is initialized by including an element if the index of the string is 1. If the string is empty, then raise a ValueError.

    • iterable – If an iterable, then it is assumed to contain a list of nonnegative integers and those integers are placed in the set.

  • capacity – (default: None) The maximum capacity of the bitset. If this is not specified, then it is automatically calculated from the passed iterable. It must be at least one.

OUTPUT:

  • None.

The string representation of a FrozenBitset FB can be understood as follows. Let \(B = b_0 b_1 b_2 \cdots b_k\) be the string representation of the bitset FB, where each \(b_i \in \{0, 1\}\). We read the \(b_i\) from left to right. If \(b_i = 1\), then the nonnegative integer \(i\) is in the bitset FB. Similarly, if \(b_i = 0\), then \(i\) is not in FB. In other words, FB is a subset of \(\{0, 1, 2, \dots, k\}\) and the membership in FB of each \(i\) is determined by the binary value \(b_i\).

See also

EXAMPLES:

The default bitset, which has capacity 1:

sage: FrozenBitset()
0
sage: FrozenBitset(None)
0

Trying to create an empty bitset fails:

sage: FrozenBitset([])
Traceback (most recent call last):
...
ValueError: Bitsets must not be empty
sage: FrozenBitset(list())
Traceback (most recent call last):
...
ValueError: Bitsets must not be empty
sage: FrozenBitset(())
Traceback (most recent call last):
...
ValueError: Bitsets must not be empty
sage: FrozenBitset(tuple())
Traceback (most recent call last):
...
ValueError: Bitsets must not be empty
sage: FrozenBitset("")
Traceback (most recent call last):
...
ValueError: Bitsets must not be empty

We can create the all-zero bitset as follows:

sage: FrozenBitset(capacity=10)
0000000000
sage: FrozenBitset([], capacity=10)
0000000000

We can initialize a FrozenBitset with a Bitset or another FrozenBitset, and compare them for equality. As they are logically the same bitset, the equality test should return True. Furthermore, each bitset is a subset of the other.

sage: def bitcmp(a, b, c):  # custom function for comparing bitsets
....:     print(a == b == c)
....:     print((a <= b, b <= c, a <= c))
....:     print((a >= b, b >= c, a >= c))
....:     print((a != b, b != c, a != c))
sage: a = Bitset("1010110"); b = FrozenBitset(a); c = FrozenBitset(b)
sage: a; b; c
1010110
1010110
1010110
sage: a < b, b < c, a < c
(False, False, False)
sage: a > b, b > c, a > c
(False, False, False)
sage: bitcmp(a, b, c)
True
(True, True, True)
(True, True, True)
(False, False, False)

Try a random bitset:

sage: a = Bitset(randint(0, 1) for n in range(1, randint(1, 10^4)))
sage: b = FrozenBitset(a); c = FrozenBitset(b)
sage: bitcmp(a, b, c)
True
(True, True, True)
(True, True, True)
(False, False, False)

A bitset with a hard-coded bitstring:

sage: FrozenBitset('101')
101

For a string, only those positions with 1 would be initialized to 1 in the corresponding position in the bitset. All other characters in the string, including 0, are set to 0 in the resulting bitset.

sage: FrozenBitset('a')
0
sage: FrozenBitset('abc')
000
sage: FrozenBitset('abc1')
0001
sage: FrozenBitset('0abc1')
00001
sage: FrozenBitset('0abc10')
000010
sage: FrozenBitset('0a*c10')
000010

Represent the first 10 primes as a bitset. The primes are stored as a list and as a tuple. We then recover the primes from its bitset representation, and query the bitset for its length (how many elements it contains) and whether an element is in the bitset. Note that the length of a bitset is different from its capacity. The length counts the number of elements currently in the bitset, while the capacity is the number of elements that the bitset can hold.

sage: p = primes_first_n(10); p                                                 # needs sage.libs.pari
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
sage: tuple(p)                                                                  # needs sage.libs.pari
(2, 3, 5, 7, 11, 13, 17, 19, 23, 29)
sage: F = FrozenBitset(p); F; FrozenBitset(tuple(p))                            # needs sage.libs.pari
001101010001010001010001000001
001101010001010001010001000001

Recover the primes from the bitset:

sage: for b in F:                                                               # needs sage.libs.pari
....:     print(b)
2
3
...
29
sage: list(F)                                                                   # needs sage.libs.pari
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

Query the bitset:

sage: # needs sage.libs.pari
sage: len(F)
10
sage: len(list(F))
10
sage: F.capacity()
30
sage: s = str(F); len(s)
30
sage: 2 in F
True
sage: 1 in F
False

A random iterable, with all duplicate elements removed:

sage: L = [randint(0, 100) for n in range(1, randint(1, 10^4))]
sage: FrozenBitset(L) == FrozenBitset(list(set(L)))
True
sage: FrozenBitset(tuple(L)) == FrozenBitset(tuple(set(L)))
True
capacity()#

Return the size of the underlying bitset.

The maximum value that can be stored in the current underlying bitset is self.capacity() - 1.

EXAMPLES:

sage: FrozenBitset('11000').capacity()
5
sage: FrozenBitset('110' * 32).capacity()
96
sage: FrozenBitset(range(20), capacity=450).capacity()
450
complement()#

Return the complement of self.

EXAMPLES:

sage: ~FrozenBitset('10101')
01010
sage: ~FrozenBitset('11111'*10)
00000000000000000000000000000000000000000000000000
sage: x = FrozenBitset('10'*40)
sage: x == ~x
False
sage: x == ~~x
True
sage: x|(~x) == FrozenBitset('11'*40)
True
sage: ~x == FrozenBitset('01'*40)
True
difference(other)#

Return the difference of self and other.

EXAMPLES:

sage: FrozenBitset('10101').difference(FrozenBitset('11100'))
00001
sage: FrozenBitset('11111' * 10).difference(FrozenBitset('010101' * 10))
101010101010101010101010101010101010101010101010100000000000
intersection(other)#

Return the intersection of self and other.

EXAMPLES:

sage: FrozenBitset('10101').intersection(FrozenBitset('11100'))
10100
sage: FrozenBitset('11111' * 10).intersection(FrozenBitset('010101' * 10))
010101010101010101010101010101010101010101010101010000000000
isdisjoint(other)#

Test to see if self is disjoint from other.

EXAMPLES:

sage: FrozenBitset('11').isdisjoint(FrozenBitset('01'))
False
sage: FrozenBitset('01').isdisjoint(FrozenBitset('001'))
True
sage: FrozenBitset('00101').isdisjoint(FrozenBitset('110' * 35))
False
isempty()#

Test if the bitset is empty.

INPUT:

  • None.

OUTPUT:

  • True if the bitset is empty; False otherwise.

EXAMPLES:

sage: FrozenBitset().isempty()
True
sage: FrozenBitset([1]).isempty()
False
sage: FrozenBitset([], capacity=110).isempty()
True
sage: FrozenBitset(range(99)).isempty()
False
issubset(other)#

Test to see if self is a subset of other.

EXAMPLES:

sage: FrozenBitset('11').issubset(FrozenBitset('01'))
False
sage: FrozenBitset('01').issubset(FrozenBitset('11'))
True
sage: FrozenBitset('01').issubset(FrozenBitset('01' * 45))
True
issuperset(other)#

Test to see if self is a superset of other.

EXAMPLES:

sage: FrozenBitset('11').issuperset(FrozenBitset('01'))
True
sage: FrozenBitset('01').issuperset(FrozenBitset('11'))
False
sage: FrozenBitset('01').issuperset(FrozenBitset('10' * 45))
False
symmetric_difference(other)#

Return the symmetric difference of self and other.

EXAMPLES:

sage: FrozenBitset('10101').symmetric_difference(FrozenBitset('11100'))
01001
sage: FrozenBitset('11111' * 10).symmetric_difference(FrozenBitset('010101' * 10))
101010101010101010101010101010101010101010101010100101010101
union(other)#

Return the union of self and other.

EXAMPLES:

sage: FrozenBitset('10101').union(FrozenBitset('11100'))
11101
sage: FrozenBitset('10101' * 10).union(FrozenBitset('01010' * 10))
11111111111111111111111111111111111111111111111111
sage.data_structures.bitset.test_bitset(py_a, py_b, n)#

Test the Cython bitset functions so we can have some relevant doctests.

sage.data_structures.bitset.test_bitset_copy_flex(py_a)#
sage.data_structures.bitset.test_bitset_pop(py_a)#

Tests for the bitset_pop function.

sage.data_structures.bitset.test_bitset_remove(py_a, n)#

Test the bitset_remove function.

sage.data_structures.bitset.test_bitset_set_first_n(py_a, n)#

Test the bitset function set_first_n.

sage.data_structures.bitset.test_bitset_unpickle(data)#

This (artificially) tests pickling of bitsets across systems.

INPUT:

  • data – A tuple of data as would be produced by the internal, Cython-only, method bitset_pickle.

OUTPUT:

A list form of the bitset corresponding to the pickled data.

EXAMPLES:

We compare 64-bit and 32-bit encoding. Both should unpickle on any system:

sage: from sage.data_structures.bitset import test_bitset_unpickle
sage: test_bitset_unpickle((0, 100, 2, 8, (33, 6001)))
[0, 5, 64, 68, 69, 70, 72, 73, 74, 76]
sage: test_bitset_unpickle((0, 100, 4, 4, (33, 0, 6001, 0)))
[0, 5, 64, 68, 69, 70, 72, 73, 74, 76]