Packaging the Sage Library#
Modules, packages, distribution packages#
The Sage library consists of a large number of Python modules,
organized into a hierarchical set of packages that fill the namespace
sage. All source files are located in a subdirectory of the
SAGE_ROOT/src/sage/coding/code_bounds.py provides the module
the directory containing this file,
SAGE_ROOT/src/sage/coding/, thus provides the package
There is another notion of “package” in Python, the distribution
package (also known as a “distribution” or a “pip-installable
package”). Currently, the entire Sage library is provided by a
which is generated from the directory
Note that the distribution name is not required to be a Python
identifier. In fact, using dashes (
-) is preferred to underscores in
distribution names; setuptools and other parts of Python’s packaging
infrastructure normalize underscores to dashes. (Using dots in
distribution names, to indicate ownership by organizations, still
mentioned in PEP 423, appears to
have largely fallen out of favor, and we will not use it in the SageMath
A distribution that provides Python modules in the
sage.* namespace, say
sage.PAC.KAGE, should be named sagemath-DISTRI-BUTION.
The distribution sagemath-categories provides a small subset of the modules of the Sage library, mostly from the packages
Other distributions should not use the prefix sagemath- in the distribution name. Example:
The distribution sage-sws2rst provides the Python package
sage_sws2rst, so it does not fill the
sage.* namespace and therefore does not use the prefix sagemath-.
A distribution that provides functionality that does not need to
import anything from the
sage namespace should not use the
sage namespace for its own packages/modules. It should be
positioned as part of the general Python ecosystem instead of as a
Sage-specific distribution. Examples:
The distribution pplpy provides the Python package
ppl and is a much extended version of what used to be
sage.libs.ppl, a part of the Sage library. The package
sage.libs.ppl had dependencies on
sage.rings to convert to/from Sage number types. pplpy has no such dependencies and is therefore usable in a wider range of Python projects.
The distribution memory-allocator provides the Python package
memory_allocator. This used to be
sage.ext.memory_allocator, a part of the Sage library.
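The normalization of underscores to dashes mentioned above can be illustrated with the name-normalization rule given in PEP 503. This is only a minimal sketch: the helper names normalize and uses_sagemath_prefix are made up for illustration and are not part of Sage or setuptools.

```python
import re


def normalize(name: str) -> str:
    """Normalize a distribution name as described in PEP 503.

    Runs of '-', '_' and '.' collapse to a single '-', and the
    result is lowercased.
    """
    return re.sub(r"[-_.]+", "-", name).lower()


def uses_sagemath_prefix(name: str) -> bool:
    """Hypothetical check for the naming convention described above."""
    return normalize(name).startswith("sagemath-")


# A Python package name maps to the dashed distribution name.
print(normalize("sage_sws2rst"))
# Distributions filling the sage.* namespace carry the prefix ...
print(uses_sagemath_prefix("sagemath-categories"))
# ... while general-purpose distributions do not.
print(uses_sagemath_prefix("memory-allocator"))
```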
Ordinary packages vs. implicit namespace packages#
Each module of the Sage library must be packaged in exactly one distribution
package. However, modules in a package may be included in different
distribution packages. In this regard, there is an important constraint that an
ordinary package (directory with
__init__.py file) cannot be split into
more than one distribution package.
By removing the
__init__.py file, however, we can make the package an
“implicit” (or “native”) “namespace” package, following
PEP 420. Implicit namespace packages can be
included in more than one distribution package. Hence whenever there are two
distribution packages that provide modules with a common prefix of Python
packages, that prefix needs to be an implicit namespace package, i.e., there
cannot be an
sagemath-tdlib will provide
sagemath-rw will provide
sagemath-graphs will provide all of the rest of
sage.graphs.graph_decompositions (and most of
Then, none of
can be an ordinary package (with an
__init__.py file), but rather
each of them has to be an implicit namespace package (no
For an implicit namespace package,
__init__.py cannot be used any more for
initializing the package.
In the Sage 9.6 development cycle, we still use ordinary packages by default, but several packages are converted to implicit namespace packages to support modularization.
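The behavior of implicit namespace packages can be tried out in plain Python, independently of Sage. The following sketch uses two temporary directories to stand in for two installed distributions; the package name demo_ns_pkg and the module names part_a and part_b are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_portion(root: Path, package: str, module: str, body: str) -> None:
    """Create root/package/module.py, deliberately without __init__.py."""
    pkg_dir = root / package
    pkg_dir.mkdir(parents=True)
    (pkg_dir / f"{module}.py").write_text(body)


# Two separate directories play the role of two distribution packages.
dist_a = Path(tempfile.mkdtemp())
dist_b = Path(tempfile.mkdtemp())
make_portion(dist_a, "demo_ns_pkg", "part_a", "VALUE = 'from distribution A'\n")
make_portion(dist_b, "demo_ns_pkg", "part_b", "VALUE = 'from distribution B'\n")

# Make both directories importable.
sys.path[:0] = [str(dist_a), str(dist_b)]
importlib.invalidate_caches()

import demo_ns_pkg.part_a
import demo_ns_pkg.part_b

# Both modules are found under the same package name, and the package's
# __path__ has one entry per contributing directory.
print(demo_ns_pkg.part_a.VALUE)
print(demo_ns_pkg.part_b.VALUE)
print(len(demo_ns_pkg.__path__))
```

If an __init__.py file were added to demo_ns_pkg in one of the two directories, that directory would become an ordinary package and the module from the other directory would no longer be importable.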
Source directories of distribution packages#
The development of the Sage library uses a monorepo strategy for
all distribution packages that fill the
sage.* namespace. This
means that the source trees of these distributions are included in a
git repository, in a subdirectory of
All these distribution packages have matching version numbers. From the viewpoint of a single distribution, this means that sometimes there will be a new release of some distribution where the only thing changing is the version number.
The source directory of a distribution package, such as
SAGE_ROOT/pkgs/sagemath-standard, contains the following files:
sage – a relative symbolic link to the monolithic Sage library source tree
MANIFEST.in – controls which files and directories of the monolithic Sage library source tree are included in the distribution
pyproject.toml, setup.cfg, and requirements.txt – standard Python packaging metadata, declaring the distribution name, dependencies, etc.
README.rst – a description of the distribution
LICENSE.txt – relative symbolic link to the same files in
VERSION.txt – package version. This file is updated by the release manager by running the
Sometimes it may be necessary to upload a hotfix for a distribution package to PyPI. These should be marked by adding a suffix
.post2; see PEP 440 on post-releases. For example, if the current development release is
9.7.beta8, then such a version could be marked
Also sometimes when working on tickets it may be necessary to increment the version because a new feature is needed in another distribution package. Such versions should be marked by using the version number of the anticipated next development release and adding a suffix
.dev2… (see PEP 440 on developmental releases). For example, if the current development release is
9.7.beta9.dev1. If the current development release is the stable release
After the ticket is merged in the next development version, it will be synchronized again with the other package versions.
setup.py – a setuptools-based installation script
tox.ini – configuration for testing with tox
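The version suffixes described above under VERSION.txt can be told apart mechanically. The following is a minimal sketch with a deliberately simplified pattern; the helper release_kind is hypothetical, the version 9.7.beta8.post1 is only an illustrative value, and a real tool would rely on a full PEP 440 parser instead.

```python
import re

# Simplified pattern: a trailing ".postN" or ".devN" segment.
_SUFFIX = re.compile(r"\.(post|dev)(\d+)$")


def release_kind(version: str) -> str:
    """Classify a version string by its trailing suffix (hypothetical helper)."""
    match = _SUFFIX.search(version)
    if match is None:
        return "regular"
    kinds = {"post": "post-release", "dev": "developmental release"}
    return kinds[match.group(1)]


for version in ["9.7.beta8", "9.7.beta8.post1", "9.7.beta9.dev1"]:
    print(version, "->", release_kind(version))
```

Under PEP 440, a post-release sorts after the release it amends, and a developmental release sorts before the release it anticipates.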
The technique of using symbolic links pointing into
has allowed the modularization effort to keep the
tree monolithic: Modularization has been happening behind the scenes
and will not change where Sage developers find the source files.
When adding a new distribution package that uses a symbolic link pointing into
SAGE_ROOT/src, please update
Some of these files may actually be generated from source files with suffix
.m4 by the
SAGE_ROOT/bootstrap script via the
m4 macro processor.
For every distribution package, there is also a subdirectory of
which contains the build infrastructure that is specific to Sage-the-distribution.
Note that these subdirectories follow a different naming convention,
using underscores instead of dashes, see Directory Structure.
Because the distribution packages are included in the source tree, we set them
up as “script packages” instead of “normal packages”, see Package source types.
Dependencies and distribution packages#
When preparing a portion of the Sage library as a distribution package, dependencies matter.
If the portion of the library contains any Cython modules, these
modules are compiled during the wheel-building phase of the
distribution package. If the Cython module uses
cimport to pull in
.pxd files, these files must be either part of the
portion shipped as the distribution being built, or the distribution
that provides these files must be installed in the build
environment. Also, any C/C++ libraries that the Cython module uses
must be accessible from the build environment.
Declaring build-time dependencies: Modern Python packaging provides a
mechanism to declare build-time dependencies on other distribution
packages via the file pyproject.toml
([build-system] requires); this
has superseded the older
setup_requires declaration. (There is no
mechanism to declare anything regarding the C/C++ libraries.)
While the namespace
sage.* is organized roughly according to
mathematical fields or categories, how we partition the implementation
modules into distribution packages has to respect the hard constraints
that are imposed by the build-time dependencies.
We can define some meaningful small distributions that just consist of
a single or a few Cython modules. For example, sagemath-tdlib
(trac ticket #29864) would just package the single
Cython module that must be linked with
sage.graphs.graph_decompositions.tdlib. Starting with the Sage
9.6 development cycle, as soon as namespace packages are activated, we
can start to create these distributions. This is quite a mechanical
Reducing build-time dependencies: Sometimes it is possible to replace build-time dependencies of a Cython module on a library by a runtime dependency. In other cases, it may be possible to split a module that simultaneously depends on several libraries into smaller modules, each of which has narrower dependencies.
Module-level runtime dependencies#
import statements at the top level of a Python or Cython
module are executed when the module is imported. Hence, the imported
modules must be part of the distribution, or provided by another
distribution – which then must be declared as a run-time dependency.
Declaring run-time dependencies: These dependencies are declared in
setup.cfg (generated from
Reducing module-level run-time dependencies:
Avoid importing from
sage.PAC.KAGE is a namespace package. The main purpose of the
*.all modules is for populating the global interactive environment that is available to users at the
sage: prompt. In particular, no Sage library code should import from
Replace module-level imports by method-level imports. Note that this comes with a small runtime overhead, which can become noticeable if the method is called in tight inner loops.
Sage provides the
lazy_import() mechanism. Lazy imports can be declared at the module level, but the actual importing is only done on demand. It is a runtime error at that time if the imported module is not present. This can be convenient compared to local imports in methods when the same imports are needed in several methods.
Avoid the “modularization anti-pattern” of importing a class from another module just to run an
isinstance(object, Class) test, in particular when the module implementing
Class has heavy dependencies. For example, importing the class
pAdicField (or the function
is_pAdicField) requires the libraries NTL and PARI.
Instead, provide an abstract base class (ABC) in a module that only has light dependencies, make
Class a subclass of
ABC, and use
isinstance(object, ABC). For example,
sage.rings.abc provides abstract base classes for many ring (parent) classes, including
sage.rings.abc.pAdicField. So we can replace:
from sage.rings.padics.generic_nodes import pAdicFieldGeneric  # heavy dependencies
isinstance(object, pAdicFieldGeneric)
and:
from sage.rings.padics.generic_nodes import is_pAdicField  # heavy dependencies
is_pAdicField(object)  # deprecated
by:
import sage.rings.abc  # no dependencies
isinstance(object, sage.rings.abc.pAdicField)
Note that going through the abstract base class only incurs a small performance penalty:
sage: object = Qp(5)
sage: from sage.rings.padics.generic_nodes import pAdicFieldGeneric
sage: %timeit isinstance(object, pAdicFieldGeneric)  # fast  # not tested
68.7 ns ± 2.29 ns per loop (...)
sage: import sage.rings.abc
sage: %timeit isinstance(object, sage.rings.abc.pAdicField)  # also fast  # not tested
122 ns ± 1.9 ns per loop (...)
If it is not possible or desired to create an abstract base class for
isinstance testing (for example, when the class is defined in some external package), other solutions need to be used.
Note that Python caches successful module imports, but repeating an unsuccessful module import incurs a cost every time:
sage: from sage.schemes.generic.scheme import Scheme
sage: sZZ = Scheme(ZZ)
sage: def is_Scheme_or_Pluffe(x):
....:     if isinstance(x, Scheme):
....:         return True
....:     try:
....:         from xxxx_does_not_exist import Pluffe  # slow on every call
....:     except ImportError:
....:         return False
....:     return isinstance(x, Pluffe)
sage: %timeit is_Scheme_or_Pluffe(sZZ)  # fast  # not tested
111 ns ± 1.15 ns per loop (...)
sage: %timeit is_Scheme_or_Pluffe(ZZ)  # slow  # not tested
143 µs ± 2.58 µs per loop (...)
The lazy_import() mechanism can be used to simplify this pattern via the
__instancecheck__() method and has similar performance characteristics:
sage: lazy_import('xxxx_does_not_exist', 'Pluffe')
sage: %timeit isinstance(sZZ, (Scheme, Pluffe))  # fast  # not tested
95.2 ns ± 0.636 ns per loop (...)
sage: %timeit isinstance(ZZ, (Scheme, Pluffe))  # slow  # not tested
158 µs ± 654 ns per loop (...)
It is faster to do the import only once, for example when loading the module, and to cache the failure. We can use the following idiom, which makes use of the fact that
isinstance accepts arbitrarily nested tuples of types, including the empty tuple:
sage: try:
....:     from xxxx_does_not_exist import Pluffe  # runs once
....: except ImportError:
....:     # Set to empty tuple of types for isinstance
....:     Pluffe = ()
sage: %timeit isinstance(sZZ, (Scheme, Pluffe))  # fast  # not tested
95.9 ns ± 1.52 ns per loop (...)
sage: %timeit isinstance(ZZ, (Scheme, Pluffe))  # fast  # not tested
126 ns ± 1.9 ns per loop (...)
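The abstract base class pattern recommended above for isinstance tests can also be sketched in plain Python, outside of Sage. All class and function names below are hypothetical; AbstractField merely plays the role that sage.rings.abc.pAdicField plays in the text, and the three commented parts stand for what would be three separate modules.

```python
from abc import ABC


# "Light" module: only abstract base classes, no heavy imports.
class AbstractField(ABC):
    """Hypothetical ABC that client code can test against cheaply."""


# "Heavy" module: the implementation, which in a real library might
# need compiled extension modules or external libraries.
class ConcreteField(AbstractField):
    """Hypothetical implementation class."""

    def __init__(self, p):
        self.p = p


# Client module: imports only the light module.
def is_field(obj) -> bool:
    """Test membership without importing the implementation class."""
    return isinstance(obj, AbstractField)


print(is_field(ConcreteField(5)))
print(is_field(5))
```

The client never names ConcreteField, so the module containing is_field stays importable even when the heavy implementation module is not installed.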
Other runtime dependencies#
If import statements are used within a method, the imported module
is loaded the first time that the method is called. Hence the module
defining the method can still be imported even if the module needed by
the method is not present.
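A minimal plain-Python sketch of this behavior follows. The module name xxxx_does_not_exist is deliberately absent, and the function names are hypothetical.

```python
def core_feature(x):
    """Works without any optional dependency."""
    return x + 1


def optional_feature(x):
    """Needs a module that may not be installed."""
    # Method-level import: it runs only when the function is called,
    # so defining (and importing) this code succeeds regardless.
    from xxxx_does_not_exist import transform  # hypothetical module
    return transform(x)


# The core functionality is usable ...
print(core_feature(1))

# ... and the missing dependency only surfaces when it is actually needed.
try:
    optional_feature(1)
except ImportError as exc:
    print("optional functionality unavailable:", exc)
```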
It is then a question whether a run-time dependency should be declared. If the method needing that import provides core functionality, then probably yes. But if it only provides what can be considered “optional functionality”, then probably not, and in this case it will be up to the user to install the distribution enabling this optional functionality.
As an example, let us consider designing a distribution that centers
around the package
sage.coding. First, let’s see if it uses symbolics:
(9.5.beta6) $ git grep -E 'sage[.](symbolic|functions|calculus)' src/sage/coding
src/sage/coding/code_bounds.py: from sage.functions.other import ceil
...
src/sage/coding/grs_code.py:from sage.symbolic.ring import SR
...
src/sage/coding/guruswami_sudan/utils.py:from sage.functions.other import floor
Apparently it does not in a very substantial way:
The imports of the symbolic functions
floor() can likely be replaced by the arithmetic functions
Looking at the import of
sage.coding.grs_code, it seems that
SR is used for running some symbolic sum, but the doctests do not show symbolic results, so it is likely that this can be replaced.
Note though that the above textual search for the module names is merely a heuristic. Looking at the source of “entropy”, through
sage.misc.functional, a runtime dependency on symbolics comes in. In fact, for this reason, two doctests there are already marked as
# optional - sage.symbolic.
So if packaged as sagemath-coding, now a domain expert would have
to decide whether these dependencies on symbolics are strong enough to
declare a runtime dependency (
sagemath-symbolics. This declaration would mean that any user who
installs sagemath-coding (
pip install sagemath-coding) would
pull in sagemath-symbolics, which has heavy compile-time
The alternative is to consider the use of symbolics by sagemath-coding merely as something that provides some extra features, which will only be working if the user also has installed sagemath-symbolics.
Declaring optional run-time dependencies: It is possible to declare
such optional dependencies as extras_require in
setup.cfg (generated from setup.cfg.m4). This is a very limited mechanism
– in particular it does not affect the build phase of the
distribution in any way. It basically only provides a way to give a
nickname to a distribution that can be installed as an add-on.
In our example, we could declare an
extras_require so that users can run
pip install sagemath-coding[symbolics].
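To make the two kinds of declarations concrete, the sketch below embeds a hypothetical setup.cfg for sagemath-coding and reads it with Python's standard configparser module. The section and option names follow the setuptools declarative configuration format; the listed dependencies are assumptions chosen for illustration, not the actual metadata of any Sage distribution.

```python
import configparser

# Hypothetical setup.cfg content; the dependency lists are illustrative only.
SETUP_CFG = """\
[metadata]
name = sagemath-coding

[options]
install_requires =
    sagemath-categories

[options.extras_require]
symbolics =
    sagemath-symbolics
"""

cfg = configparser.ConfigParser()
cfg.read_string(SETUP_CFG)

# Hard run-time dependencies: always installed together with the distribution.
required = cfg["options"]["install_requires"].split()

# Optional dependencies: installed only on request, for example with
# "pip install sagemath-coding[symbolics]".
extras = {
    name: value.split()
    for name, value in cfg["options.extras_require"].items()
}

print(required)
print(extras)
```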
Doctests often use examples constructed using functionality provided by other portions of the Sage library. This kind of integration testing is one of the strengths of Sage; but it also creates extra dependencies.
Fortunately, these dependencies are very mild, and we can deal with
them using the same mechanism that we use for making doctests
conditional on the presence of optional libraries: using
# optional -
FEATURE directives in the doctests. Adding these directives will
allow developers to test the distribution separately, without
requiring all of Sage to be present.
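As an illustration of such a directive, here is a sketch of a hypothetical function scale whose docstring contains one unconditional doctest and one doctest that is only run when the feature sage.symbolic is available. The function is not part of the Sage library, and the doctests are meant for the Sage doctester, not for Python's own doctest module.

```python
def scale(v, c):
    r"""
    Return the list obtained by multiplying each entry of ``v`` by ``c``.

    EXAMPLES::

        sage: scale([1, 2], 3)
        [3, 6]

    Integration test with symbolics, skipped when the feature is absent::

        sage: x = SR.var('x')                  # optional - sage.symbolic
        sage: scale([1, 2], x)                 # optional - sage.symbolic
        [x, 2*x]
    """
    return [c * a for a in v]


# The function itself has no dependency on symbolics.
print(scale([1, 2], 3))
```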
Declaring doctest-only dependencies: The extras_require mechanism mentioned above can also be used for this.
Version constraints of dependencies#
The version information for dependencies comes from the files
build/pkgs/*/package-version.txt. We use the
macro processor to insert the version information in the generated files
Hierarchy of distribution packages#
Solid arrows indicate
install_requires, i.e., a declared runtime dependency.
Dashed arrows indicate
extras_require, i.e., a declared optional runtime dependency.
Not shown in the diagram are build dependencies and optional dependencies for testing.
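Because install_requires declarations form a directed acyclic graph, a valid installation order can be computed by topological sorting. The sketch below uses the standard graphlib module (Python 3.9 or newer). Only the edge from sagemath-categories to sagemath-objects is taken from the descriptions in this section; the other edges are assumptions made for illustration.

```python
from graphlib import TopologicalSorter

# Maps each distribution to the distributions it requires at run time.
install_requires = {
    "sagemath-environment": set(),
    "sagemath-objects": {"sagemath-environment"},   # assumption
    "sagemath-categories": {"sagemath-objects"},    # from the text
    "sagemath-repl": {"sagemath-objects"},          # assumption
}

# static_order() yields every distribution after all of its requirements.
order = list(TopologicalSorter(install_requires).static_order())
print(order)
```

A cycle among the declared dependencies would make TopologicalSorter raise graphlib.CycleError, which is one way to detect that a proposed partition of modules into distributions is not viable.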
sage_conf is a configuration module. It provides the configuration variable settings determined by the
sagemath-environment provides the connection to the system and software environment. It includes
sagemath-objects provides a small fundamental subset of the modules of the Sage library, in particular all of
sage.structure, a small portion of
sage.categories, and a portion of
sagemath-categories provides a small subset of the modules of the Sage library, building upon sagemath-objects. It provides all of
sage.categories and a small portion of
sagemath-repl provides the IPython kernel and Sage preparser (
sage.repl), the Sage doctester (
sage.doctest), and some related modules from
Testing distribution packages#
Of course, we need tools for testing modularized distributions of portions of the Sage library.
Modularized distributions must be testable separately!
But we want to keep integration testing with other portions of Sage too!
Whenever an optional package is needed for a particular test, we use the
# optional. This mechanism can also be used for making a
doctest conditional on the presence of a portion of the Sage library.
The available tags take the form of package or module names such as
sage.symbolic. They are defined via
Feature subclasses in the module
also provides the mapping from features to the distributions providing them
(actually, to SPKG names). Using this mapping, Sage can issue installation
hints to the user.
For example, the package
sage.tensor is purely algebraic and has
no dependency on symbolics. However, there are a small number of
doctests that depend on
sage.symbolic.ring.SymbolicRing for integration
testing. Hence, these doctests are marked
# optional -
Testing the distribution in virtual environments with tox#
So how to test that this works?
Sure, we could go into the installation directory
SAGE_VENV/lib/python3.9/site-packages/ and do
sage/symbolic and test that things still work. But that’s not a good
way of testing.
Instead, we use a virtual environment in which we only install the distribution to be tested (and its Python dependencies).
Let’s try it out first with the entire Sage library, represented by
the distribution sagemath-standard. Note that after Sage has been
built normally, a set of wheels for all installed Python packages is
$ ls venv/var/lib/sage/wheels
Babel-2.9.1-py2.py3-none-any.whl
Cython-0.29.24-cp39-cp39-macosx_11_0_x86_64.whl
Jinja2-2.11.2-py2.py3-none-any.whl
...
sage_conf-9.5b6-py3-none-any.whl
...
scipy-1.7.2-cp39-cp39-macosx_11_0_x86_64.whl
setuptools-58.2.0-py3-none-any.whl
...
wheel-0.37.0-py2.py3-none-any.whl
widgetsnbextension-3.5.1-py2.py3-none-any.whl
zipp-3.5.0-py3-none-any.whl
Note in particular the wheel for sage-conf, which provides
configuration variable settings and the connection to the non-Python
packages installed in
We can now set up a separate virtual environment, in which we install
these wheels and our distribution to be tested. This is where
tox comes into play: It is the standard Python tool for creating
disposable virtual environments for testing. Every distribution in
SAGE_ROOT/pkgs/ provides a configuration file
Following the comments in the file
SAGE_ROOT/pkgs/sagemath-standard/tox.ini, we can try the following
$ ./bootstrap && ./sage -sh -c '(cd pkgs/sagemath-standard && SAGE_NUM_THREADS=16 tox -v -v -v -e sagepython-sagewheels-nopypi)'
This command does not make any changes to the normal installation of
Sage. The virtual environment is created in a subdirectory of
SAGE_ROOT/pkgs/sagemath-standard/.tox/. After the command
finishes, we can start the separate installation of the Sage library
in its virtual environment:
We can also run parts of the testsuite:
$ pkgs/sagemath-standard/.tox/sagepython-sagewheels-nopypi/bin/sage -tp 4 src/sage/graphs/
The .tox directory can be safely deleted at any time.
We can do the same with other distributions, for example the large
(from trac ticket #32601), which is intended to provide
everything that is currently in the standard Sage library, i.e.,
without depending on optional packages, but without the packages
Again we can run the test with
tox in a separate virtual environment:
$ ./bootstrap && ./sage -sh -c '(cd pkgs/sagemath-standard-no-symbolics && SAGE_NUM_THREADS=16 tox -v -v -v -e sagepython-sagewheels-nopypi)'
Some small distributions, for example the ones providing the two lowest levels, sagemath-objects and sagemath-categories (from trac ticket #29865), can be installed and tested without relying on the wheels from the Sage build:
$ ./bootstrap && ./sage -sh -c '(cd pkgs/sagemath-objects && SAGE_NUM_THREADS=16 tox -v -v -v -e sagepython)'
This command finds the declared build-time and run-time dependencies
on PyPI, either as source tarballs or as prebuilt wheels, and builds
and installs the distribution
sagemath-objects in a virtual
environment in a subdirectory of
Building these small distributions serves as a valuable regression testsuite. However, a current issue with both of these distributions is that they are not separately testable: The doctests for these modules depend on a lot of other functionality from higher-level parts of the Sage library.