How to write your own classes for coding theory ¶

Author: David Lucas (2016) for initial version. Marketa Slukova (2019) for this version.

This tutorial, designed for advanced users who want to build their own classes, will explain step by step what you need to do to write code which integrates well in the framework of coding theory. During this tutorial, we will cover the following parts:

how to write a new code family
how to write a new encoder
how to write a new decoder
how to write a new channel

This tutorial focuses on the example of implementation of the repetition code. At the end of each part, we will summarize every important step of the implementation. If one just wants a quick access to the implementation of one of the objects cited above, one can jump directly to the end of related part, which presents a summary of what to do.

I. Abstract classes ¶

There is a number of abstract classes representing different types of codes in the coding module of Sage. Depending on the type of code you want to implement, you should inherit from different abstract classes.

The most generic class is sage.coding.abstract_code:AbstractCode. This class makes no assumptions about linearity, metric, finiteness or the number of alphabets. The abstract notion of “code” that is implicitly used for this class is any enumerable subset of a cartesian product \(A_1 \times A_2 \times \ldots \times A_n\) for some sets \(A_i\).

If, for example, you want to create a non-linear code family, this is the class you should inherit from.

We give a small example of creating a non-linear code family. Take the code consisting of the following \(l\) words:

\(\{00 \ldots 00, 10 \ldots 00, 11 \ldots 00, \ldots , 11 \ldots 10, 11 \ldots 11 \}\).

Here is how we can implement it:

Sage

sage: from sage.coding.abstract_code import AbstractCode
sage: class ExampleCodeFamily(AbstractCode):
....:     def __init__(self, length):
....:         super(ExampleCodeFamily, self).__init__(length)
....:     def __iter__(self):
....:         for i in range(self.length() + 1):
....:             yield vector([1 for j in range(i)] + [0 for k in range(i, self.length())])
....:     def __contains__(self, word):
....:         return word in list(self)
....:     def _repr_(self):
....:         return "Dummy code of length {}".format(self.length())

sage: C = ExampleCodeFamily(4) # check that this works.
sage: C.list()
[(0, 0, 0, 0), (1, 0, 0, 0), (1, 1, 0, 0), (1, 1, 1, 0), (1, 1, 1, 1)]

Python

>>> from sage.all import *
>>> from sage.coding.abstract_code import AbstractCode
>>> class ExampleCodeFamily(AbstractCode):
...     def __init__(self, length):
...         super(ExampleCodeFamily, self).__init__(length)
...     def __iter__(self):
...         for i in range(self.length() + Integer(1)):
...             yield vector([Integer(1) for j in range(i)] + [Integer(0) for k in range(i, self.length())])
...     def __contains__(self, word):
...         return word in list(self)
...     def _repr_(self):
...         return "Dummy code of length {}".format(self.length())

>>> C = ExampleCodeFamily(Integer(4)) # check that this works.
>>> C.list()
[(0, 0, 0, 0), (1, 0, 0, 0), (1, 1, 0, 0), (1, 1, 1, 0), (1, 1, 1, 1)]

Focusing on linear codes, the most generic representative is the class sage.coding.linear_code_no_metric:AbstractLinearCodeNoMetric which contains all the methods that all linear codes share regardless of what their metric is. If you want to implement a linear code over some metric which has not been implemented in Sage yet, this is the class you should be inheriting from.

We have two metric specific abstract linear code classes, sage.coding.linear_code:AbstractLinearCode for Hamming metric and sage.coding.linear_rank_metric:AbstractLinearRankMetricCode for rank metric. If you wish to implement a linear code class over one of these metrics, you should inherit from the given abstract class.

II. The repetition code ¶

We want to implement in Sage the well-known repetition code. Its definition follows:

the \((n, 1)\)-repetition code over \(\GF{q}\) is the code formed by all the vectors of \(\GF{q}^{n}\) of the form \((i, i, i, \dots, i)\) for all \(i \in \GF{q}\).

For example, the \((3, 1)\)-repetition code over \(\GF{2}\) is: \(C = \{(0, 0, 0), (1, 1, 1)\}\).

The encoding is very simple, it only consists in repeating \(n\) times the input symbol and pick the vector thus formed.

The decoding uses majority voting to select the right symbol (over \(\GF{2}\)). If we receive the word \((1, 0, 1)\) (example cont’d), we deduce that the original word was \((1)\). It can correct up to \(\left\lceil \frac{n-1}{2} \right\rceil\) errors.

Through all this tutorial, we will illustrate the implementation of the \((n, 1)\)-repetition code over \(\GF{2}\).

III. Write a new code class ¶

The first thing to do to write a new code class is to identify which abstract class to inherit from. Since the repetition code is linear and we take it over the Hamming metric, this means that we will inherit from sage.coding.linear_code.AbstractLinearCode.

Now we have to identify the initializing parameters of this class, which are:

the length of the code,
the base field of the code,
the default encoder for the code,
the default decoder for the code and
any other useful argument we want to set at construction time.

For our code, we know its length, its dimension, its base field, one encoder and one decoder.

Now we isolated the parameters of the code, we can write the constructor of our class. As we said, every linear code class over the Hamming metric must inherit from sage.coding.linear_code.AbstractLinearCode. This class provides a lot of useful methods and, as we illustrate thereafter, a default constructor which sets the length, the base field, the default encoder and the default decoder as class parameters. We also need to create the dictionary of known encoders and decoders for the class.

Let us now write the constructor for our code class, that we store in some file called repetition_code.py:

Sage

sage: from sage.coding.linear_code import AbstractLinearCode
sage: from sage.rings.finite_rings.finite_field_constructor import FiniteField as GF
sage: class BinaryRepetitionCode(AbstractLinearCode):
....:     _registered_encoders = {}
....:     _registered_decoders = {}
....:     def __init__(self, length):
....:         super(BinaryRepetitionCode, self).__init__(GF(2), length,
....:           "RepetitionGeneratorMatrixEncoder", "MajorityVoteDecoder")
....:         self._dimension = 1

Python

>>> from sage.all import *
>>> from sage.coding.linear_code import AbstractLinearCode
>>> from sage.rings.finite_rings.finite_field_constructor import FiniteField as GF
>>> class BinaryRepetitionCode(AbstractLinearCode):
...     _registered_encoders = {}
...     _registered_decoders = {}
...     def __init__(self, length):
...         super(BinaryRepetitionCode, self).__init__(GF(Integer(2)), length,
...           "RepetitionGeneratorMatrixEncoder", "MajorityVoteDecoder")
...         self._dimension = Integer(1)

As you notice, the constructor is really simple. Most of the work is indeed managed by the topclass through the super statement. Note that the dimension is not set by the abstract class, because for some code families the exact dimension is hard to compute. If the exact dimension is known, set it using _dimension as a class parameter.

We can now write representation methods for our code class:

Sage

sage: def _repr_(self):
....:     return "Binary repetition code of length %s" % self.length()
sage: def _latex_(self):
....:     return "\textnormal{Binary repetition code of length } %s" % self.length()

Python

>>> from sage.all import *
>>> def _repr_(self):
...     return "Binary repetition code of length %s" % self.length()
>>> def _latex_(self):
...     return "\textnormal{Binary repetition code of length } %s" % self.length()

We also write a method to check equality:

Sage

sage: def __eq__(self, other):
....:     return (isinstance(other, BinaryRepetitionCode)
....:             and self.length() == other.length()
....:             and self.dimension() == other.dimension())

Python

>>> from sage.all import *
>>> def __eq__(self, other):
...     return (isinstance(other, BinaryRepetitionCode)
...             and self.length() == other.length()
...             and self.dimension() == other.dimension())

After these examples, you probably noticed that we use two methods, namely length() and dimension() without defining them. That is because their implementation is provided in parent classes of sage.coding.linear_code.AbstractLinearCode, which are sage.coding.linear_code_no_metric.AbstractLinearCodeNoMetric and sage.coding.abstract_code.AbstractCode

They provide a default implementation of the following getter methods:

They also provide several other useful methods, such as __contains__. Note that a lot of these other methods rely on the computation of a generator matrix. It is thus highly recommended to set an encoder which knows how to compute such a matrix as default encoder. As default encoder will be used by all these methods which expect a generator matrix, if one provides a default encoder which does not have a generator_matrix method, a lot of generic methods will fail.

As our code family is really simple, we do not need anything else, and the code provided above is enough to describe properly a repetition code.

Summary of the implementation for linear codes over the Hamming metric¶

Inherit from sage.coding.linear_code.AbstractLinearCode.
Add _registered_encoders = {} and _registered_decoders = {} as class variables.

Add this line in the class’ constructor:

super(ClassName, self).__init__(base_field, length, "DefaultEncoder", "DefaultDecoder")

Implement representation methods (not mandatory, but highly advised) _repr_ and _latex_.
Implement __eq__.
__ne__, length and dimension come with the abstract class.

Please note that dimension will not work is there is no field _dimension as class parameter.

We now know how to write a new code class. Let us see how to write a new encoder and a new decoder.

VI. Write a new channel class ¶

Alongside all these new structures directly related to codes, we also propose a whole new and shiny structure to experiment on codes, and more specifically on their decoding.

Indeed, we implemented a structure to emulate real-world communication channels.

I’ll propose here a step-by-step implementation of a dummy channel for example’s sake.

We will implement a very naive channel which works only for words over \(\GF{2}\) and flips as many bits as requested by the user.

As channels are not directly related to code families, but more to vectors and words, we have a specific file, channel.py to store them.

So we will just add our new class in this file.

For starters, we ask ourselves the eternal question: What do we need to describe a channel? Well, we mandatorily need its input_space and its output_space. Of course, in most of the cases, the user will be able to provide some extra information on the channel’s behaviour. In our case, it will be the number of bits to flip (aka the number of errors).

As you might have guess, there is an abstract class to take care of the mandatory arguments! Plus, in our case, as this channel only works for vectors over \(\GF{2}\), the input and output spaces are the same. Let us write the constructor of our new channel class:

Sage

sage: from sage.coding.channel import Channel
sage: class BinaryStaticErrorRateChannel(Channel):
....:     def __init__(self, space, number_errors):
....:         if space.base_ring() is not GF(2):
....:             raise ValueError("Provided space must be a vector space over GF(2)")
....:         if number_errors > space.dimension():
....:             raise ValueErrors("number_errors cannot be bigger than input space's dimension")
....:         super(BinaryStaticErrorRateChannel, self).__init__(space, space)
....:         self._number_errors = number_errors

Python

>>> from sage.all import *
>>> from sage.coding.channel import Channel
>>> class BinaryStaticErrorRateChannel(Channel):
...     def __init__(self, space, number_errors):
...         if space.base_ring() is not GF(Integer(2)):
...             raise ValueError("Provided space must be a vector space over GF(2)")
...         if number_errors > space.dimension():
...             raise ValueErrors("number_errors cannot be bigger than input space's dimension")
...         super(BinaryStaticErrorRateChannel, self).__init__(space, space)
...         self._number_errors = number_errors

Remember to inherit from sage.coding.channel.Channel!

We also want to override representation methods _repr_ and _latex_:

Sage

sage: def _repr_(self):
....:     return ("Binary static error rate channel creating %s errors, of input and output space %s"
....:             % (format_interval(no_err), self.input_space()))

sage: def _latex_(self):
....:     return ("\\textnormal{Static error rate channel creating %s errors, of input and output space %s}"
....:             % (format_interval(no_err), self.input_space()))

Python

>>> from sage.all import *
>>> def _repr_(self):
...     return ("Binary static error rate channel creating %s errors, of input and output space %s"
...             % (format_interval(no_err), self.input_space()))

>>> def _latex_(self):
...     return ("\\textnormal{Static error rate channel creating %s errors, of input and output space %s}"
...             % (format_interval(no_err), self.input_space()))

We don’t really see any use case for equality methods (__eq__ and __ne__) so do not provide any default implementation. If one needs these, one can of course override Python’s default methods.

We of course want getter methods. There is a provided default implementation for input_space and output_space, so we only need one for number_errors:

Sage

sage: def number_errors(self):
....:     return self._number_errors

Python

>>> from sage.all import *
>>> def number_errors(self):
...     return self._number_errors

So, now we want a method to actually add errors to words. As it is the same thing as transmitting messages over a real-world channel, we propose two methods, transmit and transmit_unsafe. As you can guess, transmit_unsafe tries to transmit the message without checking if it is in the input space or not, while transmit checks this before the transmission… Which means that transmit has a default implementation which calls transmit_unsafe. So we only need to override transmit_unsafe! Let us do it:

Sage

sage: def transmit_unsafe(self, message):
....:     w = copy(message)
....:     number_err = self.number_errors()
....:     V = self.input_space()
....:     F = GF(2)
....:     for i in sample(range(V.dimension()), number_err):
....:         w[i] += F.one()
....:     return w

Python

>>> from sage.all import *
>>> def transmit_unsafe(self, message):
...     w = copy(message)
...     number_err = self.number_errors()
...     V = self.input_space()
...     F = GF(Integer(2))
...     for i in sample(range(V.dimension()), number_err):
...         w[i] += F.one()
...     return w

That is it, we now have our new channel class ready to use!

Summary of the implementation for channels¶

Inherit from sage.coding.channel.Channel.

Add this line in the class’ constructor:

super(ClassName, self).__init__(input_space, output_space)

Implement representation methods (not mandatory) _repr_ and _latex_.
input_space and output_space getter methods come with the abstract class.
Override transmit_unsafe.

VII. Sort our new elements ¶

As there is many code families and channels in the coding theory library, we do not wish to store all our classes directly in Sage’s global namespace.

We propose several catalog files to store our constructions, namely:

codes_catalog.py,
encoders_catalog.py,
decoders_catalog.py and
channels_catalog.py.

Every time one creates a new object, it should be added in the dedicated catalog file instead of coding theory folder’s all.py.

Here it means the following:

add the following in codes_catalog.py:

from sage.coding.repetition_code import BinaryRepetitionCode

add the following in encoders_catalog.py:

from sage.coding.repetition_code import BinaryRepetitionCodeGeneratorMatrixEncoder

add the following in decoders_catalog.py:

from sage.coding.repetition_code import BinaryRepetitionCodeMajorityVoteDecoder

add the following in channels_catalog.py:

from sage.coding.channel import BinaryStaticErrorRateChannel

VIII. Complete code of this tutorial ¶

If you need some base code to start from, feel free to copy-paste and derive from the one that follows.

repetition_code.py (with two encoders):

from sage.coding.linear_code import AbstractLinearCode
from sage.coding.encoder import Encoder
from sage.coding.decoder import Decoder
from sage.rings.finite_rings.finite_field_constructor import FiniteField as GF

class BinaryRepetitionCode(AbstractLinearCode):

    _registered_encoders = {}
    _registered_decoders = {}

    def __init__(self, length):
        super(BinaryRepetitionCode, self).__init__(GF(2), length, "RepetitionGeneratorMatrixEncoder", "MajorityVoteDecoder")
        self._dimension = 1

    def _repr_(self):
        return "Binary repetition code of length %s" % self.length()

    def _latex_(self):
        return "\textnormal{Binary repetition code of length } %s" % self.length()

    def __eq__(self, other):
        return (isinstance(other, BinaryRepetitionCode)
           and self.length() == other.length()
           and self.dimension() == other.dimension())


class BinaryRepetitionCodeGeneratorMatrixEncoder(Encoder):

    def __init__(self, code):
        super(BinaryRepetitionCodeGeneratorMatrixEncoder, self).__init__(code)

    def _repr_(self):
        return "Binary repetition encoder for the %s" % self.code()

    def _latex_(self):
        return "\textnormal{Binary repetition encoder for the } %s" % self.code()

    def __eq__(self, other):
        return (isinstance(other, BinaryRepetitionCodeGeneratorMatrixEncoder)
           and self.code() == other.code())

    def generator_matrix(self):
        n = self.code().length()
        return Matrix(GF(2), 1, n, [GF(2).one()] * n)


class BinaryRepetitionCodeStraightforwardEncoder(Encoder):

    def __init__(self, code):
        super(BinaryRepetitionCodeStraightforwardEncoder, self).__init__(code)

    def _repr_(self):
        return "Binary repetition encoder for the %s" % self.code()

    def _latex_(self):
        return "\textnormal{Binary repetition encoder for the } %s" % self.code()

    def __eq__(self, other):
        return (isinstance(other, BinaryRepetitionCodeStraightforwardEncoder)
           and self.code() == other.code())

    def encode(self, message):
        return vector(GF(2), [message] * self.code().length())

    def unencode_nocheck(self, word):
        return word[0]

    def message_space(self):
        return GF(2)


class BinaryRepetitionCodeMajorityVoteDecoder(Decoder):

    def __init__(self, code):
        super(BinaryRepetitionCodeMajorityVoteDecoder, self).__init__(code, code.ambient_space(),
           "RepetitionGeneratorMatrixEncoder")

    def _repr_(self):
        return "Majority vote-based decoder for the %s" % self.code()

    def _latex_(self):
        return "\textnormal{Majority vote based-decoder for the } %s" % self.code()


    def __eq__(self, other):
        return (isinstance(other, BinaryRepetitionCodeMajorityVoteDecoder)
           and self.code() == other.code())

    def decode_to_code(self, word):
        list_word = word.list()
        count_one = list_word.count(GF(2).one())
        n = self.code().length()
        length = len(list_word)
        F = GF(2)
        if count_one > length / 2:
            return vector(F, [F.one()] * n)
        elif count_one < length / 2:
           return vector(F, [F.zero()] * n)
        else:
           raise DecodingError("impossible to find a majority")

    def decoding_radius(self):
        return (self.code().length()-1) // 2


BinaryRepetitionCode._registered_encoders["RepetitionGeneratorMatrixEncoder"] = BinaryRepetitionCodeGeneratorMatrixEncoder
BinaryRepetitionCode._registered_encoders["RepetitionStraightforwardEncoder"] = BinaryRepetitionCodeStraightforwardEncoder
BinaryRepetitionCode._registered_decoders["MajorityVoteDecoder"] = BinaryRepetitionCodeMajorityVoteDecoder
BinaryRepetitionCodeMajorityVoteDecoder._decoder_type = {"hard-decision", "unique"}

channel.py (continued):

class BinaryStaticErrorRateChannel(Channel):

    def __init__(self, space, number_errors):
        if space.base_ring() is not GF(2):
            raise ValueError("Provided space must be a vector space over GF(2)")
        if number_errors > space.dimension():
            raise ValueErrors("number_errors cannot be bigger than input space's dimension")
        super(BinaryStaticErrorRateChannel, self).__init__(space, space)
        self._number_errors = number_errors

    def _repr_(self):
      return ("Binary static error rate channel creating %s errors, of input and output space %s"
              % (format_interval(no_err), self.input_space()))

    def _latex_(self):
      return ("\\textnormal{Static error rate channel creating %s errors, of input and output space %s}"
              % (format_interval(no_err), self.input_space()))

    def number_errors(self):
      return self._number_errors

    def transmit_unsafe(self, message):
        w = copy(message)
        number_err = self.number_errors()
        V = self.input_space()
        F = GF(2)
        for i in sample(range(V.dimension()), number_err):
            w[i] += F.one()
        return w

codes_catalog.py (continued):

from sage.coding.repetition_code import BinaryRepetitionCode

encoders_catalog.py (continued):

from sage.coding.repetition_code import (BinaryRepetitionCodeGeneratorMatrixEncoder, BinaryRepetitionCodeStraightforwardEncoder)

decoders_catalog.py (continued):

from sage.coding.repetition_code import BinaryRepetitionCodeMajorityVoteDecoder

channels_catalog.py (continued):

from sage.coding.channel import (ErrorErasureChannel, StaticErrorRateChannel, BinaryStaticErrorRateChannel)

How to write your own classes for coding theory ¶

I. Abstract classes ¶

II. The repetition code ¶

III. Write a new code class ¶

Summary of the implementation for linear codes over the Hamming metric¶

IV. Write a new encoder class ¶

We have a generator matrix¶

We do not have any generator matrix¶

Summary of the implementation for encoders¶

V. Write a new decoder class ¶

Summary of the implementation for decoders¶

VI. Write a new channel class ¶

Summary of the implementation for channels¶

VII. Sort our new elements ¶

VIII. Complete code of this tutorial ¶