A tool for inspecting Python pickles
AUTHORS:
 Carl Witty (2009-03)
The explain_pickle function takes a pickle and produces Sage code that will evaluate to the contents of the pickle. Ideally, the combination of explain_pickle to produce Sage code and sage_eval to evaluate the code would be a 100% compatible implementation of cPickle’s unpickler; this is almost the case now.
EXAMPLES:
sage: explain_pickle(dumps(12345))
pg_make_integer = unpickle_global('sage.rings.integer', 'make_integer')
pg_make_integer('c1p')
sage: explain_pickle(dumps(polygen(QQ)))
pg_Polynomial_rational_flint = unpickle_global('sage.rings.polynomial.polynomial_rational_flint', 'Polynomial_rational_flint')
pg_unpickle_PolynomialRing = unpickle_global('sage.rings.polynomial.polynomial_ring_constructor', 'unpickle_PolynomialRing')
pg_RationalField = unpickle_global('sage.rings.rational_field', 'RationalField')
pg = unpickle_instantiate(pg_RationalField, ())
pg_make_rational = unpickle_global('sage.rings.rational', 'make_rational')
pg_Polynomial_rational_flint(pg_unpickle_PolynomialRing(pg, ('x',), None, False), [pg_make_rational('0'), pg_make_rational('1')], False, True)
sage: sage_eval(explain_pickle(dumps(polygen(QQ)))) == polygen(QQ)
True
By default (as above) the code produced contains calls to several utility functions (unpickle_global, etc.); this is done so that the code is truly equivalent to the pickle. If the pickle can be loaded into a future version of Sage, then the code that explain_pickle produces today should work in that future Sage as well.
It is also possible to produce simpler code, that is tied to the current version of Sage; here are the above two examples again:
sage: explain_pickle(dumps(12345), in_current_sage=True)
from sage.rings.integer import make_integer
make_integer('c1p')
sage: explain_pickle(dumps(polygen(QQ)), in_current_sage=True)
from sage.rings.polynomial.polynomial_rational_flint import Polynomial_rational_flint
from sage.rings.polynomial.polynomial_ring_constructor import unpickle_PolynomialRing
from sage.rings.rational import make_rational
Polynomial_rational_flint(unpickle_PolynomialRing(RationalField(), ('x',), None, False), [make_rational('0'), make_rational('1')], False, True)
The explain_pickle function has several use cases.
Write pickling support for your classes
You can use explain_pickle to see what will happen when a pickle is unpickled. Consider: is this sequence of commands something that can be easily supported in all future Sage versions, or does it expose internal design decisions that are subject to change?
Debug old pickles
If you have a pickle from an old version of Sage that no longer unpickles, you can use explain_pickle to see what it is trying to do, to figure out how to fix it.
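When debugging an old pickle, the first thing to check is which classes and functions it refers to. As a rough illustration outside Sage, the standard-library pickletools module can list those references without executing the pickle. The helper name referenced_globals is made up for this sketch, and it only covers the textual GLOBAL and INST opcodes (protocol 4 uses STACK_GLOBAL, which this sketch does not handle).

```python
import pickle
import pickletools
from fractions import Fraction

def referenced_globals(data):
    """Return (module, name) pairs named by GLOBAL/INST opcodes in a pickle."""
    found = []
    for op, arg, _pos in pickletools.genops(data):
        if op.name in ("GLOBAL", "INST"):
            # pickletools reports the argument as "module name".
            module, name = arg.split(" ", 1)
            found.append((module, name))
    return found

# Protocol 2 is requested explicitly so that GLOBAL is emitted.
names = referenced_globals(pickle.dumps(Fraction(22, 7), protocol=2))
```

If a listed module or name no longer exists, that is the lookup an override has to redirect.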
Use explain_pickle in doctests to help maintenance
If you have a loads(dumps(S)) doctest, you could also add an explain_pickle(dumps(S)) doctest. Then if something changes in a way that would invalidate old pickles, the output of explain_pickle will also change. At that point, you can add the previous output of explain_pickle as a new set of doctests (and then update the explain_pickle doctest to use the new output), to ensure that old pickles will continue to work.
As mentioned above, there are several output modes for explain_pickle that control fidelity versus simplicity of the output. For example, the GLOBAL instruction takes a module name and a class name and produces the corresponding class. So GLOBAL of sage.rings.integer, Integer is approximately equivalent to sage.rings.integer.Integer.
However, this class lookup process can be customized (using sage.misc.persist.register_unpickle_override). For instance, if some future version of Sage renamed sage/rings/integer.pyx to sage/rings/knuth_was_here.pyx, old pickles would no longer work unless register_unpickle_override was used; in that case, GLOBAL of 'sage.rings.integer', 'integer' would mean sage.rings.knuth_was_here.integer.
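The idea of redirecting a GLOBAL lookup can be sketched in plain Python by overriding pickle.Unpickler.find_class. This is only an analogue of register_unpickle_override, not Sage's implementation; the OVERRIDES table, the module name old_module and the class name OldName are hypothetical, and fractions.Fraction merely stands in for "the class under its new name".

```python
import io
import pickle

# Hypothetical override table: (old module, old name) -> (new module, new name).
OVERRIDES = {("old_module", "OldName"): ("fractions", "Fraction")}

class OverridingUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Consult the table first, then fall back to the normal lookup.
        module, name = OVERRIDES.get((module, name), (module, name))
        return super().find_class(module, name)

def loads_with_overrides(data):
    return OverridingUnpickler(io.BytesIO(data)).load()

# Hand-written protocol-0 pickle: GLOBAL 'old_module OldName', then STOP.
old_pickle = b"cold_module\nOldName\n."
resolved = loads_with_overrides(old_pickle)
```

Without the table entry, loading old_pickle would fail because old_module cannot be imported.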
By default, explain_pickle will map this GLOBAL instruction to unpickle_global('sage.rings.integer', 'integer'). Then when this code is evaluated, unpickle_global will look up the current mapping in the register_unpickle_override table, so the generated code will continue to work even in hypothetical future versions of Sage where integer.pyx has been renamed.
If you pass the flag in_current_sage=True, then explain_pickle will generate code that may only work in the current version of Sage, not in future versions. In this case, it would generate:
from sage.rings.integer import integer
and if you ran explain_pickle in a hypothetical future Sage, it would generate:
from sage.rings.knuth_was_here import integer
but the current code wouldn't work in the future Sage.
If you pass the flag default_assumptions=True, then explain_pickle will generate code that would work in the absence of any special unpickling information. That is, in either current Sage or hypothetical future Sage, it would generate:
from sage.rings.integer import integer
The intention is that default_assumptions output is prettier (more human-readable), but may not actually work; so it is only intended for human reading.
There are several functions used in the output of explain_pickle. Here I give a brief description of what they usually do, as well as how to modify their operation (for instance, if you're trying to get old pickles to work).
 unpickle_global(module, classname): unpickle_global('sage.foo.bar', 'baz') is usually equivalent to sage.foo.bar.baz, but this can be customized with register_unpickle_override.
 unpickle_newobj(klass, args): Usually equivalent to klass.__new__(klass, *args). If klass is a Python class, then you can define __new__() to control the result (this result actually need not be an instance of klass). (This doesn't work for Cython classes.)
 unpickle_build(obj, state): If obj has a __setstate__() method, then this is equivalent to obj.__setstate__(state). Otherwise uses state to set the attributes of obj. Customize by defining __setstate__().
 unpickle_instantiate(klass, args): Usually equivalent to klass(*args). Cannot be customized.
 unpickle_appends(lst, vals): Appends the values in vals to lst. If not isinstance(lst, list), can be customized by defining an append() method.
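The "usually equivalent to" descriptions above can be written out as plain Python. These are simplified stand-ins under the stated descriptions, not the Sage functions themselves: the sketch_ names are invented here, sketch_build only handles a dictionary state, and none of the exact cPickle corner cases are reproduced.

```python
def sketch_newobj(klass, args):
    # Allocate through __new__ only; __init__ is not called.
    return klass.__new__(klass, *args)

def sketch_build(obj, state):
    # Use the object's own __setstate__ when it defines one ...
    setstate = getattr(obj, "__setstate__", None)
    if setstate is not None:
        setstate(state)
    else:
        # ... otherwise copy the state dictionary into attributes.
        for key, value in state.items():
            setattr(obj, key, value)

def sketch_instantiate(klass, args):
    return klass(*args)

def sketch_appends(lst, vals):
    if isinstance(lst, list):
        # Real lists are extended directly, bypassing any overridden append.
        list.extend(lst, vals)
    else:
        for value in vals:
            lst.append(value)

class Plain:
    """A featureless class for trying the helpers."""

obj = sketch_newobj(Plain, ())
sketch_build(obj, {"hello": 42})
items = []
sketch_appends(items, (1, 2, 3))
```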

class sage.misc.explain_pickle.EmptyNewstyleClass
Bases: object
A featureless new-style class (inherits from object); used for testing explain_pickle.

class sage.misc.explain_pickle.EmptyOldstyleClass
A featureless old-style class (does not inherit from object); used for testing explain_pickle.

class sage.misc.explain_pickle.PickleDict(items)
Bases: object
An object which can be used as the value of a PickleObject. The items is a list of key-value pairs, where the keys and values are SageInputExpressions. We use this to help construct dictionary literals, instead of always starting with an empty dictionary and assigning to it.

class sage.misc.explain_pickle.PickleExplainer(sib, in_current_sage=False, default_assumptions=False, pedantic=False)
Bases: object
An interpreter for the pickle virtual machine, that executes symbolically and constructs SageInputExpressions instead of directly constructing values.
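To make "executes symbolically" concrete, here is a toy interpreter that walks a pickle's opcodes and builds source text instead of values. It is a sketch only: it uses the standard-library pickletools.genops rather than Sage's machinery, plain strings take the place of SageInputExpressions, memo writes are ignored (so no sharing is tracked), and only a handful of opcodes are supported. The names explain_simple and render are invented for this example.

```python
import pickle
import pickletools

MARK = object()  # sentinel playing the role of the VM's 'mark'

def render(value):
    """Turn a stack value into source text; lists are kept as Python lists."""
    if isinstance(value, list):
        return "[" + ", ".join(render(v) for v in value) + "]"
    return value

def explain_simple(data):
    stack = []
    for op, arg, _pos in pickletools.genops(data):
        name = op.name
        if name in ("PROTO", "BINPUT", "LONG_BINPUT"):
            continue  # memo writes are ignored in this sketch
        elif name in ("BININT", "BININT1", "BININT2", "BINUNICODE"):
            stack.append(repr(arg))  # a literal is its own source text
        elif name == "EMPTY_LIST":
            stack.append([])
        elif name == "MARK":
            stack.append(MARK)
        elif name == "APPEND":
            item = stack.pop()
            stack[-1].append(item)
        elif name == "APPENDS":
            items = []
            while stack[-1] is not MARK:
                items.append(stack.pop())
            stack.pop()  # drop the mark itself
            stack[-1].extend(reversed(items))
        elif name == "TUPLE2":
            second, first = stack.pop(), stack.pop()
            stack.append("(" + render(first) + ", " + render(second) + ")")
        elif name == "STOP":
            return render(stack.pop())
        else:
            raise NotImplementedError(name)
    raise ValueError("pickle ended without STOP")

source = explain_simple(pickle.dumps([1, "a", (2, 3)], protocol=2))
```

Evaluating the returned text with eval gives back a value equal to the pickled one, which is the same round trip that explain_pickle and sage_eval aim for.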

APPEND()
APPENDS()
BINFLOAT(f)
BINGET(n)
BININT(n)
BININT1(n)
BININT2(n)
BINPERSID()
BINPUT(n)
BINSTRING(s)
BINUNICODE(s)
BUILD()
DICT()
DUP()
EMPTY_DICT()
EMPTY_LIST()
EMPTY_TUPLE()
EXT1(n)
EXT2(n)
EXT4(n)
FLOAT(f)
GET(n)
GLOBAL(name)
INST(name)
INT(n)
LIST()
LONG(n)
LONG1(n)
LONG4(n)
LONG_BINGET(n)
LONG_BINPUT(n)
MARK()
NEWFALSE()
NEWOBJ()
NEWTRUE()
NONE()
OBJ()
PERSID(id)
POP()
POP_MARK()
PROTO(proto)
PUT(n)
REDUCE()
SETITEM()
SETITEMS()
SHORT_BINSTRING(s)
STOP()
STRING(s)
TUPLE()
TUPLE1()
TUPLE2()
TUPLE3()
UNICODE(s)

check_value(v)
Check that the given value is either a SageInputExpression or a PickleObject. Used for internal sanity checking.
EXAMPLES:
sage: from sage.misc.explain_pickle import *
sage: from sage.misc.sage_input import SageInputBuilder
sage: sib = SageInputBuilder()
sage: pe = PickleExplainer(sib, in_current_sage=True, default_assumptions=False, pedantic=True)
sage: pe.check_value(7)
Traceback (most recent call last):
...
AssertionError
sage: pe.check_value(sib(7))

is_mutable_pickle_object(v)
Test whether a PickleObject is mutable (has never been converted to a SageInputExpression).
EXAMPLES:
sage: from sage.misc.explain_pickle import *
sage: from sage.misc.sage_input import SageInputBuilder
sage: sib = SageInputBuilder()
sage: pe = PickleExplainer(sib, in_current_sage=True, default_assumptions=False, pedantic=True)
sage: v = PickleObject(1, sib(1))
sage: pe.is_mutable_pickle_object(v)
True
sage: sib(v)
{atomic:1}
sage: pe.is_mutable_pickle_object(v)
False

pop()
Pop a value from the virtual machine's stack, and return it.
EXAMPLES:
sage: from sage.misc.explain_pickle import *
sage: from sage.misc.sage_input import SageInputBuilder
sage: sib = SageInputBuilder()
sage: pe = PickleExplainer(sib, in_current_sage=True, default_assumptions=False, pedantic=True)
sage: pe.push(sib(7))
sage: pe.pop()
{atomic:7}

pop_to_mark()
Pop all values down to the 'mark' from the virtual machine's stack, and return the values as a list.
EXAMPLES:
sage: from sage.misc.explain_pickle import *
sage: from sage.misc.sage_input import SageInputBuilder
sage: sib = SageInputBuilder()
sage: pe = PickleExplainer(sib, in_current_sage=True, default_assumptions=False, pedantic=True)
sage: pe.push_mark()
sage: pe.push(sib(7))
sage: pe.push(sib('hello'))
sage: pe.pop_to_mark()
[{atomic:7}, {atomic:'hello'}]

push(v)
Push a value onto the virtual machine's stack.
EXAMPLES:
sage: from sage.misc.explain_pickle import *
sage: from sage.misc.sage_input import SageInputBuilder
sage: sib = SageInputBuilder()
sage: pe = PickleExplainer(sib, in_current_sage=True, default_assumptions=False, pedantic=True)
sage: pe.push(sib(7))
sage: pe.stack[-1]
{atomic:7}

push_and_share(v)
Push a value onto the virtual machine's stack; also mark it as shared for sage_input if we are in pedantic mode.
EXAMPLES:
sage: from sage.misc.explain_pickle import *
sage: from sage.misc.sage_input import SageInputBuilder
sage: sib = SageInputBuilder()
sage: pe = PickleExplainer(sib, in_current_sage=True, default_assumptions=False, pedantic=True)
sage: pe.push_and_share(sib(7))
sage: pe.stack[-1]
{atomic:7}
sage: pe.stack[-1]._sie_share
True

push_mark()
Push a 'mark' onto the virtual machine's stack.
EXAMPLES:
sage: from sage.misc.explain_pickle import *
sage: from sage.misc.sage_input import SageInputBuilder
sage: sib = SageInputBuilder()
sage: pe = PickleExplainer(sib, in_current_sage=True, default_assumptions=False, pedantic=True)
sage: pe.push_mark()
sage: pe.stack[-1]
'mark'
sage: pe.stack[-1] is the_mark
True

run_pickle(p)
Given an (uncompressed) pickle as a string, run the pickle in this virtual machine. Once a STOP has been executed, return the result (a SageInputExpression representing code which, when evaluated, will give the value of the pickle).
EXAMPLES:
sage: from sage.misc.explain_pickle import *
sage: from sage.misc.sage_input import SageInputBuilder
sage: sib = SageInputBuilder()
sage: pe = PickleExplainer(sib, in_current_sage=True, default_assumptions=False, pedantic=True)
sage: sib(pe.run_pickle('T\5\0\0\0hello.')) # py2
{atomic:'hello'}

share(v)
Mark a sage_input value as shared, if we are in pedantic mode.
EXAMPLES:
sage: from sage.misc.explain_pickle import *
sage: from sage.misc.sage_input import SageInputBuilder
sage: sib = SageInputBuilder()
sage: pe = PickleExplainer(sib, in_current_sage=True, default_assumptions=False, pedantic=True)
sage: v = sib(7)
sage: v._sie_share
False
sage: pe.share(v)
{atomic:7}
sage: v._sie_share
True


class sage.misc.explain_pickle.PickleInstance(klass)
Bases: object
An object which can be used as the value of a PickleObject. Unlike other possible values of a PickleObject, a PickleInstance doesn't represent an exact value; instead, it gives the class (type) of the object.

class sage.misc.explain_pickle.PickleObject(value, expression)
Bases: object
Pickles have a stack-based virtual machine. The explain_pickle pickle interpreter mostly uses SageInputExpressions, from sage_input, as the stack values. However, sometimes we want some more information about the value on the stack, so that we can generate better (prettier, less confusing) code. In such cases, we push a PickleObject instead of a SageInputExpression. A PickleObject contains a value (which may be a standard Python value, or a PickleDict or PickleInstance), an expression (a SageInputExpression), and an "immutable" flag (which checks whether this object has been converted to a SageInputExpression; if it has, then we must not mutate the object, since the SageInputExpression would not reflect the changes).

class sage.misc.explain_pickle.TestAppendList
Bases: list
A subclass of list, with deliberately-broken append and extend methods. Used for testing explain_pickle.

append()
A deliberately broken append method.
EXAMPLES:
sage: from sage.misc.explain_pickle import *
sage: v = TestAppendList()
sage: v.append(7) # py2
Traceback (most recent call last):
...
TypeError: append() takes exactly 1 argument (2 given)
sage: v.append(7) # py3
Traceback (most recent call last):
...
TypeError: append() takes 1 positional argument but 2 were given
We can still append by directly using the list method:
sage: list.append(v, 7)
sage: v
[7]

extend()
A deliberately broken extend method.
EXAMPLES:
sage: from sage.misc.explain_pickle import *
sage: v = TestAppendList()
sage: v.extend([3,1,4,1,5,9]) # py2
Traceback (most recent call last):
...
TypeError: extend() takes exactly 1 argument (2 given)
sage: v.extend([3,1,4,1,5,9]) # py3
Traceback (most recent call last):
...
TypeError: extend() takes 1 positional argument but 2 were given
We can still extend by directly using the list method:
sage: list.extend(v, (3,1,4,1,5,9))
sage: v
[3, 1, 4, 1, 5, 9]


class sage.misc.explain_pickle.TestAppendNonlist
Bases: object
A list-like class, carefully designed to test exact unpickling behavior. Used for testing explain_pickle.

class sage.misc.explain_pickle.TestBuild
Bases: object
A simple class with a __getstate__ but no __setstate__. Used for testing explain_pickle.

class sage.misc.explain_pickle.TestBuildSetstate
Bases: sage.misc.explain_pickle.TestBuild
A simple class with a __getstate__ and a __setstate__. Used for testing explain_pickle.

class sage.misc.explain_pickle.TestGlobalFunnyName
Bases: object
A featureless new-style class which has a name that's not a legal Python identifier.
EXAMPLES:
sage: from sage.misc.explain_pickle import *
sage: globals()['funny$name'] = TestGlobalFunnyName # see comment at end of file
sage: TestGlobalFunnyName.__name__
'funny$name'
sage: globals()['funny$name'] is TestGlobalFunnyName
True

class sage.misc.explain_pickle.TestGlobalNewName
Bases: object
A featureless new-style class. When you try to unpickle an instance of TestGlobalOldName, it is redirected to create an instance of this class instead. Used for testing explain_pickle.
EXAMPLES:
sage: from sage.misc.explain_pickle import *
sage: loads(dumps(TestGlobalOldName()))
TestGlobalNewName

class sage.misc.explain_pickle.TestGlobalOldName
Bases: object
A featureless new-style class. When you try to unpickle an instance of this class, it is redirected to create a TestGlobalNewName instead. Used for testing explain_pickle.
EXAMPLES:
sage: from sage.misc.explain_pickle import *
sage: loads(dumps(TestGlobalOldName()))
TestGlobalNewName

class sage.misc.explain_pickle.TestReduceGetinitargs
An old-style class with a __getinitargs__ method. Used for testing explain_pickle.

class sage.misc.explain_pickle.TestReduceNoGetinitargs
An old-style class with no __getinitargs__ method. Used for testing explain_pickle.

sage.misc.explain_pickle.explain_pickle(pickle=None, file=None, compress=True, **kwargs)
Explain a pickle. That is, produce source code such that evaluating the code is equivalent to loading the pickle. Feeding the result of explain_pickle to sage_eval should be totally equivalent to loading the pickle with cPickle.
INPUT:
 pickle – the pickle to explain, as a string (default: None)
 file – a filename of a pickle (default: None)
 compress – if False, don't attempt to decompress the pickle (default: True)
 in_current_sage – if True, produce potentially simpler code that is tied to the current version of Sage. (default: False)
 default_assumptions – if True, produce potentially simpler code that assumes that generic unpickling code will be used. This code may not actually work. (default: False)
 eval – if True, then evaluate the resulting code and return the evaluated result. (default: False)
 preparse – if True, then produce code to be evaluated with Sage's preparser; if False, then produce standard Python code; if None, then produce code that will work either with or without the preparser. (default: True)
 pedantic – if True, then carefully ensures that the result has at least as much sharing as the result of cPickle (it may have more, for immutable objects). (default: False)
Exactly one of pickle (a string containing a pickle) or file (the filename of a pickle) must be provided.
EXAMPLES:
sage: explain_pickle(dumps({('a', 'b'): [1r, 2r]}))
{('a', 'b'):[1r, 2r]}
sage: explain_pickle(dumps(RR(pi)), in_current_sage=True)
from sage.rings.real_mpfr import __create__RealNumber_version0
from sage.rings.real_mpfr import __create__RealField_version0
__create__RealNumber_version0(__create__RealField_version0(53r, False, 'RNDN'), '[email protected]', 32r)
sage: s = 'hi'
sage: explain_pickle(dumps((s, s)))
('hi', 'hi')
sage: explain_pickle(dumps((s, s)), pedantic=True)
si = 'hi'
(si, si)
sage: explain_pickle(dumps(5r))
5r
sage: explain_pickle(dumps(5r), preparse=False)
5
sage: explain_pickle(dumps(5r), preparse=None)
int(5)
sage: explain_pickle(dumps(22/7))
pg_make_rational = unpickle_global('sage.rings.rational', 'make_rational')
pg_make_rational('m/7')
sage: explain_pickle(dumps(22/7), in_current_sage=True)
from sage.rings.rational import make_rational
make_rational('m/7')
sage: explain_pickle(dumps(22/7), default_assumptions=True)
from sage.rings.rational import make_rational
make_rational('m/7')
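The "sharing" that the pedantic option preserves can be seen with the standard pickle module alone. The following sketch, which does not use Sage, pickles a tuple that holds the same list twice; a mutable list is used here instead of the string above so that identity is observable after loading. Protocol 2 is requested explicitly so the opcode names are predictable.

```python
import pickle
import pickletools

shared = ["hi"]
pair = (shared, shared)  # both slots refer to one list object

data = pickle.dumps(pair, protocol=2)
opcodes = [op.name for op, _arg, _pos in pickletools.genops(data)]

restored = pickle.loads(data)
# The second slot is written as a memo lookup (BINGET), so the
# unpickled tuple again holds a single list in both positions.
same_object = restored[0] is restored[1]
```

Code that simply wrote the list literal twice would produce two distinct lists, which is the difference that pedantic mode is careful about.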

sage.misc.explain_pickle.explain_pickle_string(pickle, in_current_sage=False, default_assumptions=False, eval=False, preparse=True, pedantic=False)
This is a helper function for explain_pickle. It takes a decompressed pickle string as input; other than that, its options are all the same as explain_pickle.
EXAMPLES:
sage: sage.misc.explain_pickle.explain_pickle_string(dumps("Hello, world", compress=False))
'Hello, world'
(See the documentation for explain_pickle for many more examples.)

sage.misc.explain_pickle.name_is_valid(name)
Test whether a string is a valid Python identifier. (We use a conservative test, that only allows ASCII identifiers.)
EXAMPLES:
sage: from sage.misc.explain_pickle import name_is_valid
sage: name_is_valid('fred')
True
sage: name_is_valid('Yes!ValidName')
False
sage: name_is_valid('_happy_1234')
True
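A conservative ASCII-only identifier test of this kind might look like the sketch below. This is an assumption about the approach, not a copy of name_is_valid; the name looks_like_identifier is invented here, and the check does not reject Python keywords such as class.

```python
import re

# ASCII letter or underscore first, then ASCII letters, digits, underscores.
_ASCII_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def looks_like_identifier(name):
    """Return True if name consists only of ASCII identifier characters."""
    return _ASCII_IDENTIFIER.fullmatch(name) is not None

checks = [looks_like_identifier(n) for n in ("fred", "Yes!ValidName", "_happy_1234")]
```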

sage.misc.explain_pickle.test_pickle(p, verbose_eval=False, pedantic=False, args=())
Tests explain_pickle on a given pickle p. p can be:
 a string containing an uncompressed pickle (which will always end with a '.')
 a string containing a pickle fragment (not ending with '.'); test_pickle will synthesize a pickle that will push args onto the stack (using persistent IDs), run the pickle fragment, and then STOP (if the string 'mark' occurs in args, then a mark will be pushed)
 an arbitrary object; test_pickle will pickle the object
Once it has a pickle, test_pickle will print the pickle's disassembly, run explain_pickle with in_current_sage=True and False, print the results, evaluate the results, unpickle the object with cPickle, and compare all three results.
If verbose_eval is True, then test_pickle will print messages before evaluating the pickles; this is to allow for tests where the unpickling prints messages (to verify that the same operations occur in all cases).
EXAMPLES:
sage: from sage.misc.explain_pickle import *
sage: test_pickle(['a']) # py2
    0: \x80 PROTO      2
    2: ]    EMPTY_LIST
    3: q    BINPUT     1
    5: U    SHORT_BINSTRING 'a'
    8: a    APPEND
    9: .    STOP
highest protocol among opcodes = 2
explain_pickle in_current_sage=True/False:
['a']
result: ['a']

sage.misc.explain_pickle.unpickle_appends(lst, vals)
Given a list (or list-like object) and a sequence of values, appends the values to the end of the list. This is careful to do so using the exact same technique that cPickle would use. Used by explain_pickle.
EXAMPLES:
sage: v = []
sage: unpickle_appends(v, (1, 2, 3))
sage: v
[1, 2, 3]

sage.misc.explain_pickle.unpickle_build(obj, state)
Set the state of an object. Used by explain_pickle.
EXAMPLES:
sage: from sage.misc.explain_pickle import *
sage: v = EmptyNewstyleClass()
sage: unpickle_build(v, {'hello': 42})
sage: v.hello
42

sage.misc.explain_pickle.unpickle_extension(code)
Takes an integer index and returns the extension object with that index. Used by explain_pickle.
EXAMPLES:
sage: from six.moves.copyreg import *
sage: add_extension('sage.misc.explain_pickle', 'EmptyNewstyleClass', 42)
sage: unpickle_extension(42)
<class 'sage.misc.explain_pickle.EmptyNewstyleClass'>
sage: remove_extension('sage.misc.explain_pickle', 'EmptyNewstyleClass', 42)
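For context, the extension registry that the EXT1, EXT2 and EXT4 opcodes consult lives in the standard copyreg module. The sketch below, using only the standard library, registers a code for fractions.Fraction, shows that a protocol-2 pickle of the class then uses EXT1 instead of GLOBAL, and cleans up afterwards. The code 240 is an arbitrary choice for this example.

```python
import copyreg
import pickle
import pickletools
from fractions import Fraction

CODE = 240  # arbitrary extension code chosen for this sketch

copyreg.add_extension("fractions", "Fraction", CODE)
try:
    # With a registered code, the class is written as EXT1 <code>.
    data = pickle.dumps(Fraction, protocol=2)
    opcodes = [op.name for op, _arg, _pos in pickletools.genops(data)]
    resolved = pickle.loads(data)
finally:
    # Always undo the registration so other pickles are unaffected.
    copyreg.remove_extension("fractions", "Fraction", CODE)
```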

sage.misc.explain_pickle.unpickle_instantiate(fn, args)
Instantiate a new object of class fn with arguments args. Almost always equivalent to fn(*args). Used by explain_pickle.
EXAMPLES:
sage: unpickle_instantiate(Integer, ('42',))
42

sage.misc.explain_pickle.unpickle_newobj(klass, args)
Create a new object; this corresponds to the C code klass->tp_new(klass, args, NULL). Used by explain_pickle.
EXAMPLES:
sage: unpickle_newobj(tuple, ([1, 2, 3],))
(1, 2, 3)

sage.misc.explain_pickle.unpickle_persistent(s)
Takes an integer index and returns the persistent object with that index; works by calling whatever callable is stored in unpickle_persistent_loader. Used by explain_pickle.
EXAMPLES:
sage: import sage.misc.explain_pickle
sage: sage.misc.explain_pickle.unpickle_persistent_loader = lambda n: n+7
sage: unpickle_persistent(35)
42
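Persistent IDs are a standard pickle feature: the pickler writes an ID in place of an object, and the unpickler calls a loader to turn the ID back into a value. The sketch below shows the mechanism with the standard library only. The Token class and both subclasses are invented for this example, and the loader mirrors the n+7 rule used above.

```python
import io
import pickle

class Token:
    """Hypothetical object that lives outside the pickle, identified by n."""
    def __init__(self, n):
        self.n = n

class IdPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Write only the ID for Token objects; pickle everything else normally.
        return obj.n if isinstance(obj, Token) else None

class IdUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        # Resolve an ID back to a value (here: the same n+7 rule as above).
        return pid + 7

buffer = io.BytesIO()
IdPickler(buffer, protocol=2).dump([Token(35), "plain"])
result = IdUnpickler(io.BytesIO(buffer.getvalue())).load()
```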