Tuesday 18 May 2010

Algorithms – Depth/Breadth First Search

We have seen some of the key concepts to Graphs; What a node is, an edge – as well as definitions for Digraphs and Undirected Graphs – and other bits ‘n’ bobs :) .

But one question stills looms; How do we traverse a Graph?

Here we shall look at two of the key traversing algorithms for Graphs:

  • Depth First Search (DFS), and
  • Breadth First Search (BFS).

Before i continue any further, i would like to point something out that has bugged me all day today!

The main difference between a DFS and BFS is, yes the fact that DFS goes down and BFS goes across; but my question to you is this. WHY! Why, does a DFS goes down a Graph and a BFS go across (along) a Graph?

Don't cheat. Read through this post and see if you get the answer correct, if you do, then there’s a cookie in it for you! :D

 

Depth First Search (DFS)

For now, let’s say we want to apply the DFS to an undirected graph. Since in an undirected graph there is no notion of direction, then we could end up backtracking onto nodes that we have already checked!

 

Explanation of DFS (Pseudo)

The  simple description of the algorithm for a DFS is this:

  1. Start at some node ‘i’. This is now our current node.
  2. State that our current node is ‘visited’.
  3. Now look at all nodes adjacent to our current node.
  4. If we see an adjacent node that has not been ‘visited’, add it to the stack.
  5. Then pop of the top node on the stack and traverse to it.
  6. And go back to step 1.

This algorithm terminates when all nodes have been visited, aka, the stack is empty. As you may be thinking, what’s the best way to get the adjacent nodes for this search. Well if you said Adjacency List then give yourself a pat on the back :D

 

Example

Consider the following undirected Graph:

DFS_1 

Later i will post a page that shows the code for how a DFS/BFS works, then this part may hopefully make more sense :)

But for now, let’s explain how we traverse this with a DFS.

 

  1. Let’s say we start at the node A. This is now our current node and we state it as ‘visited’.
  2. Now we add all of the nodes in A’s adjacency list to a stack, and declare them a visited, so we have a stack that looks like the following:
    • [B, E, C]   (with C as the top of the stack)
  3. Since a stack is Last In First Out (LIFO), then we pop of the node C from the stack and visit node C. Since node C has no nodes in its adjacency list, then we can traverse to the next node at the top of the stack – E.
  4. Now E has nodes A and B in its adjacency list. However, these nodes have already been declared as visited, therefore we cannot add them to the stack – too bad, hehe. This leaves one node left in the stack – B!
  5. Now at node B, we see that we can finally add a node to the stack, node D! (not forgetting to declare node D as visited now.)
  6. Node D has no nodes in its Adjacency List that we can add to the stack that haven’t already been visited, therefore, since the stack is empty, we have finishing traversing the graph.
  7. The final output, possible, is the order in which we visited the nodes. which is:
    A –> C –> E –> B –> D

I will try and create a GIF of this =/

 

Final Notes on DFS

Some final notes on DFS is that DFS is best used to answer connectivity questions. Such as determining if every pair of nodes in a graph is connected. The time complexity of DFS is O(n+m), where n is the total number of nodes, and m is the total number of edges.

 

 

Breadth First Search (BFS)

For now, let’s say we want to apply the BFS to an undirected graph. Since in an undirected graph there is no notion of direction, then we could end up backtracking onto nodes that we have already checked!

 

Explanation of BFS (Pseudo)

The  simple description of the algorithm for a BFS is this:

  1. Start at some node ‘i’. This is now our current node.
  2. State that our current node is ‘visited’.
  3. Now look at all nodes adjacent to our current node.
  4. If we see an adjacent node that has not been ‘visited’, add it to the queue.
  5. Then pull out the first node on from the queue and traverse to it.
  6. And go back to step 1.

As you may have noticed so far, this algorithm is very similar to that of DFS, however it does produce very different results, and is better in some situations than the DFS.

 

Example

Okay, time for an example. An to simplify things, and to save space; let’s use the same graph above as we did for the DFS :D

 

  1. Let’s say we start at the node A. This is now our current node and we state it as ‘visited’.
  2. Now we add all of the nodes in A’s adjacency list to a queue, and declare them a visited, so we have a queue that looks like the following:
    • [B, E, C]   (with B as the head of the queue)
  3. Since a queue is First In First Out (FIFO), then we pull out the node B from the queue and visit node B. Now at B, we can see that we can traverse to the nodes D and E, however, node E has already been declared as ‘visited’ because of node A, and E already exists in our queue. Remember, we can’t visit a noted twice! Because of this, we can only add node D to our queue:
    • [E, C, D]
  4. Now we pull of node E which has nodes A and B in its adjacency list. However, these nodes have already been declared as visited, therefore we cannot add them to the queue – too bad, hehe. This leaves us to pull out the next node – C!
  5. Since node C has no nodes in its adjacency list, then we can traverse to the next node and final node at the front of the queue – D.
  6. Node D has no nodes in its Adjacency List that we can add to the queue that haven’t already been visited, therefore, since the queue is empty, we have finishing traversing the graph.
  7. The final output, possible, is the order in which we visited the nodes. which is:
    A –> B –> E –> C –> D
    Which is very different to the output from the DFS!

I will try and create a GIF of this =/

 

Final Notes on BFS

Some final notes on BFS is that a BFS is best used at finding the shortest paths in a given Graph. The time complexity of BFS is O(n+m), where n is the total number of nodes, and m is the total number of edges.

 

Both Searches

Both of the above searches are best used for a few things, such as:

  • Testing whether or not a graph is connected.

ANSWER

Now for the part you’ve all been waiting for. Have you figured out why DFS traversing downwards and a BFS traversing along the graph?

Well, i’ll tell you any way. It’s because a DFS uses a Stack and a BFS uses a Queue to find out which node to traverse to next :)

1 comment:

  1. This is really nice. I liked the simplicity with which the algos are presented. No Strings attached. Helpful.

    ReplyDelete