Union-find structure

A union-find structure maintains a collection of sets. The sets are disjoint, so no element belongs to more than one set. Two $O(\log n)$ time operations are supported: the unite operation joins two sets, and the find operation finds the representative of the set that contains a given element.

Structure

In a union-find structure, one element in each set is the representative of the set, and there is a chain from any other element of the set to the representative. For example, assume that the sets are ${1,4,7}$, ${5}$ and ${2,3,6,8}$:

In this case the representatives of the sets are 4, 5 and 2. We can find the representative of any element by following the chain that begins at the element. For example, the element 2 is the representative for the element 6, because we follow the chain $6 \rightarrow 3 \rightarrow 2$. Two elements belong to the same set exactly when their representatives are the same.

Two sets can be joined by connecting the representative of one set to the representative of the other set. For example, the sets {1,4,7} and {2,3,6,8} can be joined as follows:

The resulting set contains the elements {1,2,3,4,6,7,8}. From this on, the element 2 is the representative for the entire set and the old representative 4 points to the element 2.

The efficiency of the union-find structure depends on how the sets are joined. It turns out that we can follow a simple strategy: always connect the representative of the \emph{smaller} set to the representative of the \emph{larger} set (or if the sets are of equal size, we can make an arbitrary choice). Using this strategy, the length of any chain will be $O(\log n)$, so we can find the representative of any element efficiently by following the corresponding chain.

Implementation

The union-find structure can be implemented using arrays. In the following implementation, the array link contains for each element the next element in the chain or the element itself if it is a representative, and the array size indicates for each representative the size of the corresponding set.

Initially, each element belongs to a separate set:

for i in 1..=n {link[i] = i};
for i in 1..=n {size[i] = i};

The function find returns the representative for an element $x$. The representative can be found by following the chain that begins at $x$.

#![allow(unused)]
fn main() {
fn find(mut x:usize, link: &[usize]) -> usize {
    while x != link[x] {x = link[x]}
    x
}
}

The function same checks whether elements $a$ and $b$ belong to the same set. This can easily be done by using the function find:

#![allow(unused)]
fn main() {
fn find(mut x:usize, link: &[usize]) -> usize {
    while x != link[x] {x = link[x]}
    x
}
fn same(a: usize, b: usize, link: &[usize]) -> bool {
    find(a, link) == find(b, link)
}
}

The function unite joins the sets that contain elements $a$ and $b$ (the elements have to be in different sets). The function first finds the representatives of the sets and then connects the smaller set to the larger set.

#![allow(unused)]
fn main() {
fn find(mut x:usize, link: &[usize]) -> usize {
    while x != link[x] {x = link[x]}
    x
}
fn swap(a: usize, b: usize){
todo!();
}
fn unite(mut a: usize, mut b: usize, link: &mut [usize], size: &mut [usize]) {
    a = find(a, link);
    b = find(b, link);
    if size[a] < size[b] {swap(a,b)}
    size[a] += size[b];
    link[b] = a;
}
}

The time complexity of the function find is $O(\log n)$ assuming that the length of each chain is $O(\log n)$. In this case, the functions same and unite also work in $O(\log n)$ time. The function unite makes sure that the length of each chain is $O(\log n)$ by connecting the smaller set to the larger set.


1 The structure presented here was introduced in 1971 by J. D. Hopcroft and J. D. Ullman [38]. Later, in 1975, R. E. Tarjan studied a more sophisticated variant of the structure [64] that is discussed in many algorithm textbooks nowadays.