Trees
Amongst the hardest problems you will probably find as an engineer will be trees and graphs without a doubt. I'll give you a few reasons on why this tends to be much harder:
- Searching and updating in a tree is harder than in a linearly organised data structure.
- The worst case (Big O) and average case time can vary wildy.
- A recursive approach is often used, in case you are like most programmers who implement iterative approaches, it might get tricky.
Let's first have a look at the different types of trees
🌳🌲🌴 Types of trees
A tree, similarly to a Linked List, is composed of nodes, and it starts with a root node.
Each node has zero or more children, and each child node has zero or more children and so on.
It is important to notice that trees can't be circular, meaning they can't link back to any of its parent nodes.
Trees vs Binary Trees
One of the most common trees you will encounter is a binary tree. All this means is that each node has exactly 0,1 or two children maximum.
Lets have a look at a Ternary Tree (one node has 3 children):
There could be occasions where you want to represent a tree with 10 nodes for instance, these would be called 10-ary trees, meaning each node can end up having to 10 children.
Binary Tree vs Binary Search Tree
We have a binary search tree were every node fits a specific ordering pattern. Starting from the root node, all nodes smaller than it will go to the left, and the children nodes of the root will do as well - recursively. All the nodes bigger than the root will go to the right.
Is worth pointing out that people might have slight different definitions of a binary search tree. For some people this might not include duplicates, and if they do they might be on the right side, or left. Is worth clarifying whenever you are implementing this data structure.
Balanced vs Unbalanced
A lot of people tend to think that balanced means that they are of the same size in every possible side. But, when someone refers to a tree as balanced - what they want you to think about if is your insert and find algorithms will roughly run in O(Log N)?
Complete Binary Trees
These are trees that are completely filled, except perhaps for the level at the bottom. To the extent that the last level is filled, if is filled from left to right.
Not Completed Binary Tree:
Complete Binary Tree:
Full Binary Trees
A tree is considered full when every single node has either zero or two children, but never just one child.
Perfect Binary Trees
A perfect binary tree is one that is both full and complete. All leaf nodes will be at the same level, and this level has the maximum number of nodes.
Binary Tree Traversal
There are two general strategies to traverse a tree:
Depth First Search (DFS)
In this strategy, we adopt the depth as the priority, so that one would start from a root and reach all the way down to certain leaf, and then back to root to reach another branch.
The DFS strategy can further be distinguished as preorder, inorder, and postorder depending on the relative order among the root node, left node and right node.
Breadth First Search (BFS)
We scan through the tree level by level, following the order of height, from top to bottom. The nodes on higher level would be visited before the ones with lower levels.
On the following figure the nodes are enumerated in the order you visit them, please follow 1-2-3-4-5 to compare different strategies.
Tree Node Code
public class BinaryTreeNode {
int data;
BinaryTreeNode left;
BinaryTreeNode right;
public BinaryTreeNode(int d) {
this.data = d;
}
}
In-Order Traversal
This means to visit the left branch, then the current node and finally the right branch. When performed over a binary tree it visits the nodes in an ascending order.
public static void inOrderTraversal(BinaryTreeNode node) {
if (node != null) {
inOrderTraversal(node.left);
visit(node);
inOrderTraversal(node.right);
}
}
private static void visit(BinaryTreeNode node) {
System.out.println("visiting node: " + node.data);
}
Pre-Order Traversal
This means that we visit the current node prior visiting the children nodes.
public static void preOrderTraversal(BinaryTreeNode node) {
if (node != null) {
visit(node);
inOrderTraversal(node.left);
inOrderTraversal(node.right);
}
}
Post-Order Traversal
This means that we visit the children nodes prior visiting the current one.
public static void postOrderTraversal(BinaryTreeNode node) {
if (node != null) {
inOrderTraversal(node.left);
inOrderTraversal(node.right);
visit(node);
}
}
Complexity Analysis: Recursive vs Iterative
As we mentioned before, we can traverse a tree recursively to retrieve all the data in pre-order, in-order or post-order. The time complexity is O(N) because we visit each node exactly once. And the depth of the tree might be N in the worst case. That is to say, the level of recursion might be at most N in the worst case. Therefore, taking system stack into consideration, the space complexity is O(N) as well.
To be cautious, the complexity might be different due to a different implementation. It is comparatively easy to do traversal recursively but when the depth of the tree is too large, we might suffer from stack overflow problem. That's one of the main reasons why we want to solve this problem iteratively sometimes.
For the iterative solution, the time complexity is apparently the same with recursion solution which is O(N). The space complexity is also O(N) since in the worst case, we will have all the nodes in the stack. There are some other solutions for iterative traversal which can reduce the space complexity to O(1).
Iterative Solution
There are several iterative solutions for tree traversal. One of the solutions is to use a stack to simulate the recursion process.
Taking pre-order traversal as an example, in each iteration, we pop one node from the stack and visit this node. Then if this node has a right child, push its right child into the stack. If this node has a left child, push its left child into the stack. It is noteworthy that we push the right child first so that we can visit the left child first since the nature of the stack is LIFO(last in first out). After that, we can continue to the next iteration until the stack is empty.
public List<Integer> preorderTraversal(TreeNode root) {
List<Integer> answer = new ArrayList<>();
Stack<TreeNode> s = new Stack<TreeNode>();
if (root != null) {
s.push(root);
}
TreeNode cur;
while (!s.empty()) {
cur = s.pop();
answer.add(cur.val); // visit the root
if (cur.right != null) {
s.push(cur.right); // push right child to stack if it is not null
}
if (cur.left != null) {
s.push(cur.left); // push left child to stack if it is not null
}
}
return answer;
}
Solving Tree Problems Recursively
As we know, a tree can be defined recursively as a node(the root node) that includes a value and a list of references to children nodes. Recursion is one of the natural features of a tree. Therefore, many tree problems can be solved recursively. For each recursive function call, we only focus on the problem for the current node and call the function recursively to solve its children.
Typically, we can solve a tree problem recursively using a top-down approach or using a bottom-up approach.
"Top-down" Solution
Top-down" means that in each recursive call, we will visit the node first to come up with some values, and pass these values to its children when calling the function recursively. So the "top-down" solution can be considered as a kind of preorder traversal. To be specific, the recursive function top_down(root, params) works like this:
1. return specific value for null node
2. update the answer if needed // answer <-- params
3. left_ans = top_down(root.left, left_params) // left_params <-- root.val, params
4. right_ans = top_down(root.right, right_params) // right_params <-- root.val, params
5. return the answer if needed // answer <-- left_ans, right_ans
For instance, consider this problem: given a binary tree, find its maximum depth.
We know that the depth of the root node is 1. For each node, if we know its depth, we will know the depth of its children. Therefore, if we pass the depth of the node as a parameter when calling the function recursively, all the nodes will know their depth. And for leaf nodes, we can use the depth to update the final answer. Here is the pseudocode for the recursive function maximum_depth(root, depth):
1. return if root is null
2. if root is a leaf node:
3. answer = max(answer, depth) // update the answer if needed
4. maximum_depth(root.left, depth + 1) // call the function recursively for left child
5. maximum_depth(root.right, depth + 1) // call the function recursively for right child
Here is an example to help you understand how it works:
"Bottom-up" Solution
"Bottom-up" is another recursive solution. In each recursive call, we will firstly call the function recursively for all the children nodes and then come up with the answer according to the returned values and the value of the current node itself. This process can be regarded as a kind of postorder traversal. Typically, a "bottom-up" recursive function bottom_up(root) will be something like this:
1. return specific value for null node
2. left_ans = bottom_up(root.left) // call function recursively for left child
3. right_ans = bottom_up(root.right) // call function recursively for right child
4. return answers // answer <-- left_ans, right_ans, root.val
Let's go on discussing the question about maximum depth but using a different way of thinking: for a single node of the tree, what will be the maximum depth x of the subtree rooted at itself?
If we know the maximum depth l of the subtree rooted at its left child and the maximum depth r of the subtree rooted at its right child, can we answer the previous question? Of course yes, we can choose the maximum between them and add 1 to get the maximum depth of the subtree rooted at the current node. That is x = max(l, r) + 1.
It means that for each node, we can get the answer after solving the problem for its children. Therefore, we can solve this problem using a "bottom-up" solution. Here is the pseudocode for the recursive function maximum_depth(root):
1. return 0 if root is null // return 0 for null node
2. left_depth = maximum_depth(root.left)
3. right_depth = maximum_depth(root.right)
4. return max(left_depth, right_depth) + 1 // return depth of the subtree rooted at root
Here is an example to help you understand how it works:
Conclusion
It is not easy to understand recursion and find out a recursive solution for the problem. It needs practice.
When you meet a tree problem, ask yourself two questions: Can you determine some parameters to help the node know its answer? Can you use these parameters and the value of the node itself to determine what should be the parameters passed to its children? If the answers are both yes, try to solve this problem using a "top-down" recursive solution.
Or, you can think of the problem in this way: for a node in a tree, if you know the answer of its children, can you calculate the answer of that node? If the answer is yes, solving the problem recursively using a bottom up approach might be a good idea.