Hypergraph isomorphic copy search#

This module implements a code for the following problem:

INPUT: two hypergraphs \(H_1, H_2\)

OUTPUT: a copy of \(H_2\) in \(H_1\)

It is also possible to enumerate all such copies, and to require that such copies be induced copies. More formally:

A copy of \(H_2\) in \(H_1\) is an injection \(f:V(H_2)\mapsto V(H_1)\) such that for any set \(S_2\in E(H_2)\) we have \(f(S_2)\in E(H_1)\).

It is an induced copy if no other set of \(E(H_1)\) is contained in \(f(V(H_2))\), i.e. \(|E(H_2)|=\{S:S\in E(H_1)\text{ and }S\subseteq f(V(H_2))\}\).

The functions implemented here lists all such injections. In particular, the number of copies of \(H\) in itself is equal to \(|Aut(H)|\).

The feature is available through IncidenceStructure.isomorphic_substructures_iterator().

Implementation#

A hypergraph is stored as a list of edges, each of which is a “dense” bitset over \(|V(H_1)|\) points. In particular, two sets of distinct cardinalities require the same memory space. A hypergraph is a C struct with the following fields:

n,m (int) – number of points and edges.
limbs (int) – number of 64-bits blocks per set.
set_space (uint64_t *) – address of the memory used to store the sets.
sets (uint64_t **) – sets[i] points toward the limbs blocks encoding set \(i\). Note also that sets[i][limbs] is equal to the cardinality of set[i], so that sets has length m*(limbs+1)*sizeof(uint64_t).
names (int *) – associates an integer ‘name’ to each of the n points.

The operations used on this data structure are:

void permute(hypergraph * h, int n1, int n2) – exchanges points \(n1\) and \(n2\) in the data structure. Note that their names are also exchanged so that we still know which is which.
int induced_hypergraph(hypergraph * h1, int n, hypergraph * tmp1) – stores in tmp1 the hypergraph induced by the first \(n\) points, i.e. all sets \(S\) such that \(S\subseteq \{0,...,n-1\}\). The function returns the number of such sets.
void trace_hypergraph64(hypergraph * h, int n, hypergraph * tmp) – stores in tmp1 the trace of \(h\) on the first \(n\) points, i.e. all sets of the form \(S\cap \{0, \ldots, n-1\}\).

Algorithm#

We try all possible assignments of a representant \(r_i\in H_1\) for every \(i\in H_2\). When we have picked a representant for the first \(n<\) points \(\{0, \ldots, n-1\}\subsetneq V(H_2)\), we check that:

The hypergraph induced by the (ordered) list \(0, \ldots, n-1\) in \(H_2\) is equal to the one induced by \(r_0, \ldots, r_{n-1}\) in \(H_1\).
If \(S\subseteq \{0,...,n-1\}\) is contained in \(c\) sets of size \(k\) in \(H_2\), then \(\{r_i:i\in S\}\) is contained in \(\geq c\) sets of size \(k\) in \(H_1\). This is done by comparing the trace of the hypergraphs while remembering the original size of each set.

As we very often need to build the hypergraph obtained by the trace of the first \(n\) points (for all possible \(n\)), those hypergraphs are cached. The hypergraphs induced by the same points are handled similarly.

Limitations#

Number of points For efficiency reason the implementation assumes that \(H_2\) has \(\leq 64\) points. Making this work for larger values means that calls to qsort have to be replaced by calls to qsort_r (i.e. to sort the edges you need to know the number of limbs per edge) and that induces a big slowdown for small cases (~50% when this code was implemented). Also, 64 points for \(H_2\) is already very very big considering the problem at hand. Even \(|V(H_1)|> 64\) seems too much.

Vertex ordering The order of vertices in \(H_2\) has a huge influence on the performance of the algorithm. If no set of \(H_2\) contains more that one of the first \(k<n\) points, then almost all partial assignments of representants are possible for the first \(k\) points (though the degree of the vertices is taken into account). For this reason it is best to pick an ordering such that the first vertices are contained in as many sets as possible together. A heuristic is implemented at relabel_heuristic().

AUTHORS:

Nathann Cohen (November 2014, written in various airports between Nice and Chennai).

Methods#

class sage.combinat.designs.subhypergraph_search.SubHypergraphSearch#

Bases: object

relabel_heuristic()[source]#

Relabels \(H_2\) in order to make the algorithm faster.

Objective: we try to pick an ordering \(p_1,...,p_k\) of the points of \(H_2\) that maximizes the number of sets involving the first points in the ordering. One way to formalize the problems indicates that it may be NP-Hard (generalizes the max clique problem for graphs) so we do not try to solve it exactly: we just need a sufficiently good heuristic.

Assuming that the first points are \(p_1,...,p_k\), we determine \(p_{k+1}\) as the point \(x\) such that the number of sets \(S\) with \(x\in S\) and \(S\cap \{p_1,...,p_k\}\neq \emptyset\) is maximal. In case of ties, we take a point with maximum degree.

This function is called when an instance of SubHypergraphSearch is created.

EXAMPLES:

Sage

sage: d = designs.projective_plane(3)                                       # needs sage.schemes
sage: d.isomorphic_substructures_iterator(d).relabel_heuristic()            # needs sage.schemes

Python

>>> from sage.all import *
>>> d = designs.projective_plane(Integer(3))                                       # needs sage.schemes
>>> d.isomorphic_substructures_iterator(d).relabel_heuristic()            # needs sage.schemes