Utility functions on strings

sage.monoids.string_ops.coincidence_discriminant(S, n=2)

INPUT:

A tuple or list of strings, e.g. produced by decimating a transposition ciphertext, or taken from a sample plaintext.

OUTPUT:

A measure of how far the observed probabilities of character pairs deviate from the product of their independent one-character probabilities.

EXAMPLES:

sage: S = strip_encoding("The cat in the hat.")
sage: coincidence_discriminant([ S[i:i+2] for i in range(len(S)-1) ])
0.0827001855677322
>>> from sage.all import *
>>> S = strip_encoding("The cat in the hat.")
>>> coincidence_discriminant([ S[i:i+Integer(2)] for i in range(len(S)-Integer(1)) ])
0.0827001855677322
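One formula that reproduces the value in the example is the sum, over all pairs (a, b) of a first character and a second character, of (P(ab) - P1(a) * P2(b))^2, where P is the relative frequency of the pair among the input strings and P1, P2 are the relative frequencies of the first and second characters. The following plain-Python sketch works under that assumption; it is not Sage's implementation, the name `coincidence_discriminant_sketch` is hypothetical, and agreement with Sage on other inputs is not guaranteed.

```python
from collections import Counter
from itertools import product

def coincidence_discriminant_sketch(pairs):
    """Hypothetical sketch: squared deviation of pair frequencies from independence."""
    pairs = list(pairs)
    total = len(pairs)
    # Relative frequency of each two-character string.
    pair_p = {k: v / total for k, v in Counter(pairs).items()}
    # Relative frequencies of the first and of the second character.
    first_p = {k: v / total for k, v in Counter(p[0] for p in pairs).items()}
    second_p = {k: v / total for k, v in Counter(p[1] for p in pairs).items()}
    # Sum over every (first, second) combination, including unobserved pairs.
    return sum(
        (pair_p.get(a + b, 0.0) - first_p[a] * second_p[b]) ** 2
        for a, b in product(first_p, second_p)
    )

text = "THECATINTHEHAT"  # the stripped form of "The cat in the hat."
value = coincidence_discriminant_sketch(text[i:i + 2] for i in range(len(text) - 1))
print(value)  # approximately 0.0827
```

Pairs that never occur still contribute the square of P1(a) * P2(b), which is why the sum runs over the full product of observed first and second characters rather than over observed pairs only.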
sage.monoids.string_ops.coincidence_index(S, n=1)

Return the coincidence index of the string S.

EXAMPLES:

sage: S = strip_encoding("The cat in the hat.")
sage: coincidence_index(S)
0.120879120879121
>>> from sage.all import *
>>> S = strip_encoding("The cat in the hat.")
>>> coincidence_index(S)
0.120879120879121
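The value in the example agrees with the standard index of coincidence, sum of m*(m-1) divided by N*(N-1), where m runs over the counts of each distinct n-character substring and N is the number of such substrings. For "THECATINTHEHAT" this gives 22/182. The sketch below is written under that assumption in plain Python; the name `coincidence_index_sketch` is hypothetical, and the letters-only normalisation is an assumption made to match the example.

```python
from collections import Counter

def coincidence_index_sketch(s, n=1):
    """Hypothetical sketch of the index of coincidence for n-character substrings."""
    # Assumption: normalise the input to upper-case letters only.
    text = "".join(c.upper() for c in s if c.isalpha())
    total = len(text) - n + 1  # number of n-character substrings
    counts = Counter(text[i:i + n] for i in range(total))
    # Probability that two distinct positions hold the same substring.
    return sum(m * (m - 1) for m in counts.values()) / (total * (total - 1))

ic = coincidence_index_sketch("The cat in the hat.")
print(ic)  # approximately 0.1209
```

The sketch does not guard against inputs with fewer than two substrings, for which the denominator is zero.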
sage.monoids.string_ops.frequency_distribution(S, n=1, field=None)

Return the probability space of frequencies of the n-character substrings of S.
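As a rough illustration of the idea, the relative frequencies of n-character substrings can be computed in plain Python. This sketch returns an ordinary dictionary rather than a Sage probability space, ignores the `field` parameter, and uses the hypothetical name `frequency_distribution_sketch`.

```python
from collections import Counter

def frequency_distribution_sketch(s, n=1):
    """Hypothetical sketch: map each n-character substring to its relative frequency."""
    grams = [s[i:i + n] for i in range(len(s) - n + 1)]
    counts = Counter(grams)
    return {g: c / len(grams) for g, c in counts.items()}

dist = frequency_distribution_sketch("THECATINTHEHAT")
print(dist["T"])  # 4 of the 14 characters are "T"
```

With n=2 the same function gives bigram frequencies; for example, "TH" occurs in 2 of the 13 bigrams of "THECATINTHEHAT".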

sage.monoids.string_ops.strip_encoding(S)

Return S converted to upper case, with all non-alphabetic characters removed.

EXAMPLES:

sage: S = "The cat in the hat."
sage: strip_encoding(S)
'THECATINTHEHAT'
>>> from sage.all import *
>>> S = "The cat in the hat."
>>> strip_encoding(S)
'THECATINTHEHAT'
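The behaviour shown in the example can be approximated without Sage. The following is a minimal sketch, not the library's implementation; the name `strip_encoding_sketch` is hypothetical, and it assumes that Python's `str.isalpha` is an acceptable test for "alphabetic".

```python
def strip_encoding_sketch(s):
    """Hypothetical stand-in: keep only alphabetic characters, upper-cased."""
    return "".join(c.upper() for c in s if c.isalpha())

stripped = strip_encoding_sketch("The cat in the hat.")
print(stripped)  # THECATINTHEHAT
```

Spaces, digits, and punctuation are all dropped, so the result is suitable as input for letter-frequency statistics.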