Utility functions on strings

sage.monoids.string_ops.coincidence_discriminant(S, n=2)

INPUT:

S – tuple of strings, e.g. produced as a decimation of a transposition ciphertext, or taken from a sample plaintext

OUTPUT:

A measure of how far the observed probabilities of character pairs deviate from the products of their independent one-character probabilities.

EXAMPLES:

Sage

sage: S = strip_encoding("The cat in the hat.")
sage: coincidence_discriminant([ S[i:i+2] for i in range(len(S)-1) ])
0.0827001855677322

Python

>>> from sage.all import *
>>> S = strip_encoding("The cat in the hat.")
>>> coincidence_discriminant([ S[i:i+Integer(2)] for i in range(len(S)-Integer(1)) ])
0.0827001855677322
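
The computation can be illustrated without Sage. The following is a minimal pure-Python sketch, not Sage's implementation: it assumes the discriminant sums the squared deviation from independence over every combination of an observed first character and an observed second character, including pairs that never occur. That reading reproduces the value in the example above. The name coincidence_discriminant_sketch is hypothetical.

```python
from collections import Counter
from itertools import product


def coincidence_discriminant_sketch(pairs):
    """Hypothetical stand-in for illustration; not part of Sage."""
    pairs = list(pairs)
    if not pairs or any(len(p) != 2 for p in pairs):
        raise ValueError("expected a non-empty sequence of 2-character strings")
    total = len(pairs)
    # Empirical probabilities of the pairs and of each position separately.
    pair_prob = {p: c / total for p, c in Counter(pairs).items()}
    first_prob = {a: c / total for a, c in Counter(p[0] for p in pairs).items()}
    second_prob = {b: c / total for b, c in Counter(p[1] for p in pairs).items()}
    # Assumption: unobserved pairs contribute with pair probability 0.
    return sum(
        (pair_prob.get(a + b, 0.0) - first_prob[a] * second_prob[b]) ** 2
        for a, b in product(first_prob, second_prob)
    )


text = "THECATINTHEHAT"
value = coincidence_discriminant_sketch(
    [text[i:i + 2] for i in range(len(text) - 1)]
)
print(value)  # approximately 0.0827 (2362/28561)
```

A value near zero would indicate that the two character positions behave almost independently; larger values indicate stronger association between adjacent characters.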

sage.monoids.string_ops.coincidence_index(S, n=1)

Return the coincidence index of the string S.

EXAMPLES:

Sage

sage: S = strip_encoding("The cat in the hat.")
sage: coincidence_index(S)
0.120879120879121

Python

>>> from sage.all import *
>>> S = strip_encoding("The cat in the hat.")
>>> coincidence_index(S)
0.120879120879121
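
As a rough illustration of what this number measures, the sketch below computes the standard index of coincidence in plain Python: the probability that two n-character substrings drawn without replacement are equal. It is an assumption that Sage uses exactly this formula, although it does reproduce the value shown above. The name coincidence_index_sketch is hypothetical.

```python
from collections import Counter


def coincidence_index_sketch(text, n=1):
    """Hypothetical stand-in for illustration; not part of Sage."""
    windows = len(text) - n + 1
    if windows < 2:
        raise ValueError("need at least two n-character substrings")
    counts = Counter(text[i:i + n] for i in range(windows))
    # Ordered pairs of equal substrings divided by all ordered pairs.
    return sum(m * (m - 1) for m in counts.values()) / (windows * (windows - 1))


# Mirror the example above: upper-case letters only.
cleaned = "".join(ch for ch in "The cat in the hat.".upper() if ch.isalpha())
ic = coincidence_index_sketch(cleaned)
print(ic)  # approximately 0.1209 (22/182)
```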

sage.monoids.string_ops.frequency_distribution(S, n=1, field=None)

Return the discrete probability space of relative frequencies of the n-character substrings of S.

EXAMPLES:

Sage

sage: frequency_distribution('banana not a nana nor ananas', 2)
Discrete probability space defined by {' a': 0.0740740740740741,
 ' n': 0.111111111111111,
 'a ': 0.111111111111111,
 'an': 0.185185185185185,
 'as': 0.0370370370370370,
 'ba': 0.0370370370370370,
 'na': 0.222222222222222,
 'no': 0.0740740740740741,
 'or': 0.0370370370370370,
 'ot': 0.0370370370370370,
 'r ': 0.0370370370370370,
 't ': 0.0370370370370370}

Python

>>> from sage.all import *
>>> frequency_distribution('banana not a nana nor ananas', Integer(2))
Discrete probability space defined by {' a': 0.0740740740740741,
 ' n': 0.111111111111111,
 'a ': 0.111111111111111,
 'an': 0.185185185185185,
 'as': 0.0370370370370370,
 'ba': 0.0370370370370370,
 'na': 0.222222222222222,
 'no': 0.0740740740740741,
 'or': 0.0370370370370370,
 'ot': 0.0370370370370370,
 'r ': 0.0370370370370370,
 't ': 0.0370370370370370}
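
The probabilities above are relative frequencies of overlapping n-character windows. A minimal pure-Python sketch of that counting, returning an ordinary dict rather than a Sage probability space, might look as follows. The name frequency_distribution_sketch is hypothetical, and the field argument is deliberately left out.

```python
from collections import Counter


def frequency_distribution_sketch(text, n=1):
    """Hypothetical stand-in for illustration; not part of Sage."""
    windows = [text[i:i + n] for i in range(len(text) - n + 1)]
    if not windows:
        raise ValueError("text is shorter than n")
    total = len(windows)
    # Sort keys so the output order resembles the listing above.
    return {gram: count / total for gram, count in sorted(Counter(windows).items())}


dist = frequency_distribution_sketch("banana not a nana nor ananas", 2)
for gram, prob in dist.items():
    print(repr(gram), round(prob, 4))
```

The 28-character input yields 27 overlapping pairs, so 'na', which occurs six times, gets probability 6/27.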

sage.monoids.string_ops.strip_encoding(S)

Return S converted to upper case, with all non-alphabetic characters removed.

EXAMPLES:

Sage

sage: S = "The cat in the hat."
sage: strip_encoding(S)
'THECATINTHEHAT'

Python

>>> from sage.all import *
>>> S = "The cat in the hat."
>>> strip_encoding(S)
'THECATINTHEHAT'
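
For readers without a Sage session, the same behavior can be approximated in plain Python. This is a sketch under the assumption that "alphabetic" means whatever str.isalpha accepts, which includes non-ASCII letters; whether Sage treats such characters the same way is not stated here. The name strip_encoding_sketch is hypothetical.

```python
def strip_encoding_sketch(text):
    """Hypothetical stand-in for illustration; not part of Sage."""
    if not isinstance(text, str):
        raise TypeError("expected a string")
    # Upper-case first, then keep only alphabetic characters.
    return "".join(ch for ch in text.upper() if ch.isalpha())


stripped = strip_encoding_sketch("The cat in the hat.")
print(stripped)  # THECATINTHEHAT
```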