Lowest common ancestor

The lowest common ancestor of two nodes of a rooted tree is the lowest node whose subtree contains both the nodes. A typical problem is to efficiently process queries that ask to find the lowest common ancestor of two nodes.

For example, in the following tree, the lowest common ancestor of nodes 5 and 8 is node 2:

Next we will discuss two efficient techniques for finding the lowest common ancestor of two nodes.

Method 1

One way to solve the problem is to use the fact that we can efficiently find the $k$th ancestor of any node in the tree. Using this, we can divide the problem of finding the lowest common ancestor into two parts.

We use two pointers that initially point to the two nodes whose lowest common ancestor we should find. First, we move one of the pointers upwards so that both pointers point to nodes at the same level.

In the example scenario, we move the second pointer one level up so that it points to node 6 which is at the same level with node 5:

After this, we determine the minimum number of steps needed to move both pointers upwards so that they will point to the same node. The node to which the pointers point after this is the lowest common ancestor.

In the example scenario, it suffices to move both pointers one step upwards to node 2, which is the lowest common ancestor:

Since both parts of the algorithm can be performed in $O(\log n)$ time using precomputed information, we can find the lowest common ancestor of any two nodes in $O(\log n)$ time.

Method 2

Another way to solve the problem is based on a tree traversal array\footnote{}. Once again, the idea is to traverse the nodes using a depth-first search:

However, we use a different tree traversal array than before: we add each node to the array \emph{always} when the depth-first search walks through the node, and not only at the first visit. Hence, a node that has $k$ children appears $k+1$ times in the array and there are a total of $2n-1$ nodes in the array.

We store two values in the array: the identifier of the node and the depth of the node in the tree. The following array corresponds to the above tree:

Now we can find the lowest common ancestor of nodes $a$ and $b$ by finding the node with the minimum depth between nodes $a$ and $b$ in the array. For example, the lowest common ancestor of nodes $5$ and $8$ can be found as follows:

Node 5 is at position 2, node 8 is at position 5, and the node with minimum depth between positions $2 \ldots 5$ is node 2 at position 3 whose depth is 2. Thus, the lowest common ancestor of nodes 5 and 8 is node 2.

Thus, to find the lowest common ancestor of two nodes it suffices to process a range minimum query. Since the array is static, we can process such queries in $O(1)$ time after an $O(n \log n)$ time preprocessing.

The distance between nodes $a$ and $b$ equals the length of the path from $a$ to $b$. It turns out that the problem of calculating the distance between nodes reduces to finding their lowest common ancestor.

First, we root the tree arbitrarily. After this, the distance of nodes $a$ and $b$ can be calculated using the formula

\[ \texttt{depth}(a)+\texttt{depth}(b)-2 \cdot \texttt{depth}(c) \]

where $c$ is the lowest common ancestor of $a$ and $b$ and depth(s) denotes the depth of node $s$. For example, consider the distance of nodes 5 and 8:

The lowest common ancestor of nodes 5 and 8 is node 2. The depths of the nodes are depth(5)=3, depth(8)=4 and depth(2)=2, so the distance between nodes 5 and 8 is $3+4-2\cdot2=3$.


1 This lowest common ancestor algorithm was presented in [7]. This technique is sometimes called the Euler tour technique [66].