Graphs
What is a graph? 🤠
A Graph is a collection of nodes, similar to a Tree, and actually a Tree is a type of graph, but not all graphs are Trees.
When the nodes in a graph are connected we call them edges, and there are two types of connections between nodes. Directed which is like a one way street (unidirectional) and Undirected edges which are like a two-way street.
Is also important to note that graphs don't have to be interconnected, they might also consists of multiple isolated subgraphs. When there is a path between every pair of vertices we call it a connected graph.
Let's have a look at a graphical representation of a graph:
But, how do we represent this in a programme? Let me show you 2 common ways.
Adjacency lists
In this representation, for every node we have we store a list of its adjacent nodes, let me show you an example:
If we have the classes for the graph and node:
class Graph {
public Node[] nodes;
}
Class Node {
public int data;
public Node[] children;
}
Remember that unlike a Tree, in a graph we might be unable to reach all the nodes from the "root", hence why we need a parent Graph class.
In the case of the graph that I drew in the example it would be represented like this:
Node | Adjacency list |
---|---|
0 | 1 |
1 | 2 |
2 | 0, 3 |
3 | 2 |
4 | 6 |
5 | 4 |
6 | 5 |
Adjacency Matrices
This is a Matrix which is NxN (N is the number of nodes). It can be filled with 0/1 or true/false values indicating true whenever there is an edge from node I to node J.
Imagine the graph:
The Matrix would be represented as:
- | 0 | 1 | 2 | 3 |
---|---|---|---|---|
0 | 0 | 1 | 0 | 0 |
1 | 0 | 0 | 1 | 0 |
2 | 1 | 0 | 0 | 0 |
3 | 0 | 0 | 1 | 0 |
The same graph algorithms that are used on adjacency lists (breadth-first search, etc.) can be performed with adjacency matrices, but they may be somewhat less efficient. In the adjacency list representation, you can easily iterate through the neighbors of a node. In the adjacency matrix representation, you will need to iterate through all the nodes to identify a node's neighbors.
Graph Search
There are 2 common approaches when it comes to searching a graph.
Let's compare them with the same graph:
Depth-First Search (DFS)
Here we start at an arbitrarily selected Node, then we explore each branch completely before moving onto the next branch. We go deep first prior going wide.
DFS is the preferred search if we want to visit every single node in the graph. Bear in mind both can do it but DFS is a bit simpler.
Let's look at some PseudoCode implementation for this:
void search(Node Root) {
if (root == null) return;
visit(root);
root.visited = true;
for each (Node n in root.adjacent) {
if (n.visited == false) search(n);
}
}
As previously stated, in DFS we visit a node (N1), then we iterate through each of N1 neighbours. When visiting a node N2 which is a neighbour of N1, we visit all of N2s neighbours before going on to N1s other neighbours. We do this exhaustively until we've searched all the neighbours.
Search order for example graph: 0, 1, 3, 2, 4, 5
Note that pre-order and other forms of tree traversal are a form of DFS. The key difference is that when implementing this algorithm for a graph, we must check if the node has been visited. If we don't, we risk getting stuck in an infinite loop.
Breadth-First Search (BFS)
Same as before, we arbitrarily select a Node, but here we explore each neighbour first, before going to any children. That is, we go wide prior going deep.
BFS is generally better if we want to find the shortest path between any nodes.
Let's look at some PseudoCode implementation for this:
void search(Node root) {
Queue queue = new Queue();
root.marked = true;
queue.enqueue(root); // add to the end of the Q
while(!queue.isEmpty()) {
Node r = queue.dequeue(); // remove from the front of the Q
visit(r);
foreach (Node n in r.adjacent) {
if (n.marked == false) {
n.marked = true;
queue.enqueue(n);
}
}
}
}
As previously stated, in BFS a node visits each of N1 neighbours prior moving onto the neighbours of each neighbour. Try not to make the mistake of trying to implement this recursively, an iterative approach using a queue tends to be much easier.
Search order for example graph: 0, 1, 4, 5, 3, 2
Quick Example Comparison
Imagine representing all the friendships in the entire world in a graph. If we wanted to find a path of friendship between Diego and Chris (who are already friends).
In DFS, we could take a path like this: Diego -> Carlos -> John -> Antonio -> Jair -> Jake -> Arianna -> Luis -> etc. Then we would find ourselves very far away. We may in fact go through most of the world without realising that, in fact Diego and Chris are already friends. It also wouldn't find us the shortest path.
In BFS, we would stay close to Diego for as long as possible, we might iterate through Diego's friends, but we wouldn't go to his more distant connections until it becomes necessary. If Chris is the friend of a friend of Diego, we would find him relatively quickly as well.