# Random Node

You are implementing a binary tree class from scratch which, in addition to insert, find, and delete, has a method getRandomNode() which returns a random node from the tree. All nodes should be equally likely to be chosen. Design and implement an algorithm for getRandomNode, and explain how you would implement the rest of the methods.

## Link here to the repo to solve the problem

๐## ๐ Tips

Be very careful in this problem to ensure that each node is equally likely and that your solution doesn't slow down the speed of standard binary search tree algorithms (like insert, find, and delete). Also, remember that even if you assume that it's a balanced binary search tree, this doesn't mean that the tree is full/complete/perfect.

This is your own binary search tree class, so you can maintain any information about the tree structure or nodes that you'd like (provided it doesn't have other negative implications, like making insert much slower). In fact, there's probably a reason the interview question specified that it was your own class. You probably need to store some additional information in order to implement this efficiently.

As a naive "brute force" algorithm, can you use a tree traversal algorithm to implement this algorithm? What is the runtime of this?

Alternatively, you could pick a random depth to traverse to and then randomly traverse, stopping when you get to that depth. Think this through, though. Does this work?

Picking a random depth won't help us much. First, there's more nodes at lower depths than higher depths. Second, even if we re-balanced these probabilities, we could hit a "dead end" where we meant to pick a node at depth 5 but hit a leaf at depth 3. Re-balancing the probabilities is an interesting, though.

A naive approach that many people come up with is to pick a random number between 1 and 3. If it's 1, return the current node. If it's 2, branch left. If it's 3, branch right. This solution doesn't work. Why not? Is there a way you can adjust it to make it work?

The reason that the earlier solution (picking a random number between 1 and 3) doesn't work is that the probabilities for the nodes won't be equal. For example, the root will be returned with probability 1/3, even if there are 50+ nodes in the tree. Clearly, not all the nodes have probability 1/3, so these nodes won't have equal probability. We can resolve this one issue by picking a random number between 1 and size_of_tree instead. This only resolves the issue for the root, though.What about the rest of the nodes?

The issue with the earlier solution is that there could be more nodes on one side of a node than the other. So, we need to weight the probability of going left and right based on the number of nodes on each side. How does this work, exactly? How can we know the number of nodes?

## ๐ Solution 1

There is a relatively easy solution, you might have guessed it yourself rather quickly. Let's consider a few solutions prior constructing a fast and efficient solution. Solution 1: We can traverse the tree in any order, add the nodes to a list and then get a random Node in the list. This solution would take O(N) time and space, which is really not optimal and we could do much better.

```
public class RandomNodeTree {
BinaryTreeNode head;
public RandomNodeTree(BinaryTreeNode head) {
this.head = head;
}
public BinaryTreeNode addNodesToList() {
List<BinaryTreeNode> nodes = new ArrayList();
addNodesToList(head, nodes);
int randomN = generateRandomNumber(0, nodes.size());
return nodes.get(randomN);
}
private void addNodesToList(BinaryTreeNode n, List<BinaryTreeNode> nodes) {
if (n == null) return;
nodes.add(n);
addNodesToList(n.left, nodes);
addNodesToList(n.right, nodes);
}
private int generateRandomNumber(int min, int max) {
Random r = new Random();
return r.nextInt((max - min) - 1) + min;
}
}
```

## ๐ Solution 2

We could label all the nodes with with an index from 1 to N. Then in a binary search we could just generate a random number in that range and traverse to that node and return it.

This is not a bad idea at all, but bear in mind that in this tree we have to also be able to insert, delete and search efficiently. Meaning that when we delete we would have to update all the nodes, which would take O(N).

## ๐ Solution 3

We could try just a simple approach: traverse randomly down the tree. At each node:

- With 1/3 odds, we return the current node.
- With 1/3 odds, we traverse left.
- With 1/3 odds, we traverse right.

This solution, like some of the others, does not distribute the probabilities evenly across the nodes.The root has a 1/3 probability of being selected- the same as all the nodes in the left put together.

## ๐ Solution 4 (fast & working)

Let's see if we could possibly make solution 3 work, 3 fails because the probabilities are not evenly distributed.

We can start with the root. With what probability should we return the root? Since we have N nodes, we must return the root node with 1/N probability. (In fact, we must return each node with 1/N probability. After all, we have N nodes and each must have equal probability. The total must be 1 (100%), therefore each must have 1/N probability.)

We've resolved the issue with the root. Now what about the rest of the problem? With what probability should we traverse left versus right? It's not 50/50. Even in a balanced tree, the number of nodes on each side might not be equal. If we have more nodes on the left than the right, then we need to go left more often .

One way to think about it is that the odds of picking something - anything - from the left must be the sum of each individual probability. Since each node must have probability 1/N, the odds of picking something from the left must have probability LEFT_SIZE * 1/N. This should therefore be the odds of going left.

Likewise, the odds of going right should be RIGHT_SIZE * 1/N.

This means that each node must know the size of the nodes on the left and the size of the nodes on the right. Fortunately, our interviewer has told us that we're building this tree class from scratch. It's easy to keep track of this size information on inserts and deletes. We can just store a size variable in each node. Increment size on inserts and decrement it on deletes.

```
public class RandomNodeTree {
private int data;
public RandomNodeTree left;
public RandomNodeTree right;
private int size;
public RandomNodeTree(int data) {
this.data = data;
this.size = 1;
}
public RandomNodeTree getRandomNode() {
int leftSize = left == null ? 0 : left.size;
System.out.println(leftSize);
Random random = new Random();
int index = random.nextInt(leftSize);
if (index < leftSize) return left.getRandomNode();
else if (index == leftSize) return this;
else return right.getRandomNode();
}
public void insertInOrder(int d) {
if (d <= data) {
if (left == null) left = new RandomNodeTree(d);
else left.insertInOrder(d);
} else {
if (right == null) right = new RandomNodeTree(d);
else right.insertInOrder(d);
}
size++;
}
public RandomNodeTree find(int d) {
if (d == data) return this;
else if (d < data) return left != null ? left.find(d) : null;
else if (d > data) return right != null ? right.find(d) : null;
return null;
}
}
```

*Question borrowed from โCracking the coding interviewโ*