An efficient algorithm for computing the distance between close partitions
Abstract
A k-partition of a set S is a splitting of S into k non-overlapping classes that cover all elements of S. Numerous practical applications dealing with data partitioning or clustering require computing the distance between two partitions. Previous articles proved that this distance can be computed in polynomial time, with known minimum and maximum complexity bounds, using a reduction to the linear assignment problem. We propose several conditions under which the partition distance can be computed in linear time. In practical terms, this computation can be done in linear time for any two relatively similar partitions (i.e. with distance below a given threshold), except for specially constructed cases. Finally, we prove that, even if there is a bounded number of classes for which the proposed conditions are not satisfied, one can still preserve the linear complexity by exploiting decomposition properties of the similarity matrix.
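The reduction to the linear assignment problem mentioned above can be illustrated with a minimal sketch. It assumes the usual definition of partition distance, which the abstract does not spell out: the size of S minus the best total overlap obtainable by matching the classes of one partition one-to-one with the classes of the other. The function names and the label-list encoding are hypothetical, and the exhaustive search over matchings is a stand-in chosen for brevity; it is not the polynomial or linear-time method of the paper and is only usable for small k.

```python
from itertools import permutations


def similarity_matrix(p1, p2, k):
    """Return T where T[i][j] counts elements in class i of p1 and class j of p2.

    Assumption: each partition is a list of class labels in range(k),
    one label per element of S, with elements in the same order in both lists.
    """
    t = [[0] * k for _ in range(k)]
    for a, b in zip(p1, p2):
        t[a][b] += 1
    return t


def partition_distance(p1, p2, k):
    """Return |S| minus the maximum total similarity over class matchings.

    Brute force over all k! matchings: a stand-in for a real linear
    assignment solver, kept only to make the reduction explicit.
    """
    if len(p1) != len(p2):
        raise ValueError("both partitions must label the same elements")
    t = similarity_matrix(p1, p2, k)
    best = max(
        sum(t[i][match[i]] for i in range(k))
        for match in permutations(range(k))
    )
    return len(p1) - best


if __name__ == "__main__":
    first = [0, 0, 0, 1, 1, 2, 2]
    second = [1, 1, 0, 0, 0, 2, 2]
    print(partition_distance(first, second, 3))
```

In this example the two partitions differ only by a relabelling of classes plus one element that sits in a different class, so a single move is enough to make them identical. Building the similarity matrix takes one pass over S; the matching step is where the cost depends on k.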
Origin: Files produced by the author(s)