Offline algorithms

So far, we have discussed online algorithms for tree queries. Such algorithms process the queries one after another, so that each query is answered before the next query is received.

However, in many problems the online property is not necessary. In this section, we focus on offline algorithms, which are given the whole set of queries in advance and may answer them in any order. It is often easier to design an offline algorithm than an online one.

Merging data structures

One method to construct an offline algorithm is to perform a depth-first tree traversal and maintain data structures in nodes. At each node $s$, we create a data structure d[s] that is based on the data structures of the children of $s$. Then, using this data structure, all queries related to $s$ are processed.

As an example, consider the following problem: We are given a tree where each node has some value. Our task is to process queries of the form "calculate the number of nodes with value $x$ in the subtree of node $s$". For example, in the following tree, the subtree of node $4$ contains two nodes whose value is 3.

In this problem, we can use map structures to answer the queries. For example, the maps for node 4 and its children are as follows:

If we create such a data structure for each node, we can easily process all given queries, because we can handle all queries related to a node immediately after creating its data structure. For example, the above map structure for node 4 tells us that its subtree contains two nodes whose value is 3.

However, it would be too slow to create all data structures from scratch. Instead, at each node $s$, we create an initial data structure d[s] that only contains the value of $s$. After this, we go through the children of $s$ and merge d[s] and all data structures d[u] where $u$ is a child of $s$.

For example, in the above tree, the map for node $4$ is created by merging the following maps:

Here the first map is the initial data structure for node 4, and the other three maps correspond to nodes 7, 8 and 9.

The merging at node $s$ can be done as follows: We go through the children of $s$ and at each child $u$ merge d[s] and d[u]. We always copy the contents from d[u] to d[s]. However, before this, we swap the contents of d[s] and d[u] if d[s] is smaller than d[u]. By doing this, each value is copied only $O(\log n)$ times during the tree traversal, which ensures that the algorithm is efficient.
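
The following sketch puts these ideas together for the value-counting problem. It is only one possible implementation: the query representation (a list of (value, query index) pairs stored at each node), the use of a HashMap and the small 1-indexed tree in main are illustrative choices, and the node values in the example are made up so that the subtree of node 4 contains two nodes with value 3. The constant-time swap that the sketch relies on is discussed right after it.

use std::collections::HashMap;

fn dfs(
    s: usize,
    parent: usize,
    adj: &[Vec<usize>],
    value: &[i64],
    queries: &[Vec<(i64, usize)>],
    answer: &mut [usize],
) -> HashMap<i64, usize> {
    // the initial data structure d[s] only contains the value of s
    let mut d: HashMap<i64, usize> = HashMap::new();
    *d.entry(value[s]).or_insert(0) += 1;

    for &u in &adj[s] {
        if u == parent {
            continue;
        }
        let mut child = dfs(u, s, adj, value, queries, answer);
        // always copy the smaller map into the larger one: swap first if needed
        if d.len() < child.len() {
            std::mem::swap(&mut d, &mut child);
        }
        for (val, cnt) in child {
            *d.entry(val).or_insert(0) += cnt;
        }
    }

    // d now describes the whole subtree of s, so the queries of s can be answered
    for &(x, idx) in &queries[s] {
        answer[idx] = *d.get(&x).unwrap_or(&0);
    }
    d
}

fn main() {
    // a small hypothetical tree (1-indexed, node 0 unused) in which node 4 has
    // the children 7, 8 and 9 and its subtree contains two nodes with value 3
    let n = 9;
    let edges = [(1, 2), (1, 3), (1, 4), (2, 5), (2, 6), (4, 7), (4, 8), (4, 9)];
    let value = [0, 2, 3, 5, 3, 1, 4, 5, 4, 3];
    let mut adj = vec![Vec::new(); n + 1];
    for &(a, b) in &edges {
        adj[a].push(b);
        adj[b].push(a);
    }
    // a single query: how many nodes in the subtree of node 4 have value 3?
    let mut queries = vec![Vec::new(); n + 1];
    queries[4].push((3, 0));
    let mut answer = vec![0; 1];
    dfs(1, 1, &adj, &value, &queries, &mut answer);
    println!("{}", answer[0]); // prints 2
}

Returning the map by value is cheap, because moving a HashMap only moves its small handle; values are copied only in the merge loop, which is exactly where the $O(\log n)$ bound applies.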

To swap the contents of two data structures $a$ and $b$ efficiently, we can just use the following code:

#![allow(unused)]
fn main() {
    let mut a = 5;
    let mut b = 6;
    println!("a:{a}  b:{b}");
    // a destructuring assignment swaps the values of a and b
    (a, b) = (b, a);
    println!("a:{a}  b:{b}");
}

or, alternatively:

#![allow(unused)]
fn main() {
    let mut a = 5;
    let mut b = 6;
    println!("a:{a}  b:{b}");
    // std::mem::swap exchanges the values behind the two mutable references
    std::mem::swap(&mut a, &mut b);
    println!("a:{a}  b:{b}");
}
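
Both versions swap the two values in place. For heap-allocated structures such as a HashMap or BTreeMap this takes constant time regardless of their sizes, because only the small handles that own the data are exchanged, not the elements themselves. The following minimal check illustrates this (the map type and sizes are arbitrary):

use std::collections::BTreeMap;

fn main() {
    // a large map and a tiny one
    let mut a: BTreeMap<u32, u32> = (0..100_000).map(|i| (i, i)).collect();
    let mut b: BTreeMap<u32, u32> = BTreeMap::new();
    b.insert(0, 0);
    // constant time: only the map handles are exchanged, not the elements
    std::mem::swap(&mut a, &mut b);
    println!("a has {} entries, b has {} entries", a.len(), b.len());
}

This constant-time swap is what keeps the merging technique efficient: swapping before copying takes constant time, and each value is inserted into a new map only $O(\log n)$ times.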

Lowest common ancestor

There is also an offline algorithm for processing a set of lowest common ancestor queries¹. The algorithm is based on the union-find data structure (see Chapter 15.2), and the benefit of the algorithm is that it is easier to implement than the algorithms discussed earlier in this chapter.

The algorithm is given as input a set of pairs of nodes, and it determines for each such pair the lowest common ancestor of the nodes. The algorithm performs a depth-first tree traversal and maintains disjoint sets of nodes. Initially, each node belongs to a separate set. For each set, we also store the highest node in the tree that belongs to the set.

When the algorithm visits a node $x$, it goes through all nodes $y$ such that the lowest common ancestor of $x$ and $y$ has to be found. If $y$ has already been visited, the algorithm reports that the lowest common ancestor of $x$ and $y$ is the highest node in the set of $y$. Then, after processing node $x$, the algorithm joins the sets of $x$ and its parent.
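
The following sketch implements this idea. The union-find structure additionally stores, for each set, the highest tree node of the set, and every query is stored at both of its endpoints together with a query index, so that it is answered when the later endpoint is visited. These representation details, and the tree built in main, are illustrative choices; in particular, the tree in main is not taken from the example below but is simply a small hypothetical tree for which the queries $(5,8)$ and $(2,7)$ have the answers $2$ and $1$.

struct Dsu {
    parent: Vec<usize>,
    highest: Vec<usize>, // highest tree node in each set (stored at the representative)
}

impl Dsu {
    fn new(n: usize) -> Self {
        Dsu { parent: (0..n).collect(), highest: (0..n).collect() }
    }
    fn find(&mut self, x: usize) -> usize {
        if self.parent[x] != x {
            let root = self.find(self.parent[x]);
            self.parent[x] = root;
        }
        self.parent[x]
    }
    fn highest_of(&mut self, x: usize) -> usize {
        let r = self.find(x);
        self.highest[r]
    }
    // join the set of `child` into the set of its tree parent `p`; the parent's
    // representative stays, so its stored highest node remains correct
    fn join(&mut self, child: usize, p: usize) {
        let a = self.find(child);
        let b = self.find(p);
        self.parent[a] = b;
    }
}

fn dfs(
    x: usize,
    parent: usize,
    adj: &[Vec<usize>],
    queries: &[Vec<(usize, usize)>], // at node x: (other endpoint y, query index)
    dsu: &mut Dsu,
    visited: &mut [bool],
    answer: &mut [usize],
) {
    visited[x] = true;
    // if the other endpoint y has already been visited, the lowest common
    // ancestor of x and y is the highest node in the set of y
    for &(y, idx) in &queries[x] {
        if visited[y] {
            answer[idx] = dsu.highest_of(y);
        }
    }
    for &u in &adj[x] {
        if u != parent {
            dfs(u, x, adj, queries, dsu, visited, answer);
        }
    }
    // after processing x, join the sets of x and its parent
    if x != parent {
        dsu.join(x, parent);
    }
}

fn main() {
    // a hypothetical 1-indexed tree for which the answers of the queries
    // (5,8) and (2,7) are 2 and 1
    let n = 8;
    let edges = [(1, 2), (1, 3), (2, 4), (2, 5), (2, 6), (6, 8), (3, 7)];
    let mut adj = vec![Vec::new(); n + 1];
    for &(a, b) in &edges {
        adj[a].push(b);
        adj[b].push(a);
    }
    let pairs = [(5, 8), (2, 7)];
    let mut queries = vec![Vec::new(); n + 1];
    for (idx, &(a, b)) in pairs.iter().enumerate() {
        queries[a].push((b, idx));
        queries[b].push((a, idx));
    }
    let mut dsu = Dsu::new(n + 1);
    let mut visited = vec![false; n + 1];
    let mut answer = vec![0; pairs.len()];
    dfs(1, 1, &adj, &queries, &mut dsu, &mut visited, &mut answer);
    println!("{:?}", answer); // prints [2, 1]
}

Note that join always attaches the child's set under the representative of the parent's set, so the stored highest node stays valid for the combined set; if a union-by-size heuristic is added, the highest node has to be moved to the new representative.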

For example, suppose that we want to find the lowest common ancestors of node pairs $(5,8)$ and $(2,7)$ in the following tree:

In the following trees, gray nodes denote visited nodes and dashed groups of nodes belong to the same set. When the algorithm visits node 8, it notices that node 5 has been visited and the highest node in its set is 2. Thus, the lowest common ancestor of nodes 5 and 8 is 2:

Later, when visiting node 7, the algorithm determines that the lowest common ancestor of nodes 2 and 7 is 1:


¹ This algorithm was published by R. E. Tarjan in 1979 [65].