Topological sort is an algorithm I wish I had programmed in my brain. So many tasks to do every day, with so much dependencies among them. It would've definitely improved my productivity without being dependent on any external tool.
The example of chicken Biryani is chosen just for making it sound fun. I have tried to generalize the steps and removed details. Steps were chosen from the first link Google gave me. There are multiple variations of this recipe. If you don’t know what Biryani is, visit here. If you ever decide to cook some Biryani, this is the post you shouldn't be referring to. I guess you know what happens when we start development with vague requirements, right?
Alright then, let's start cooking some chicken Biryani.
- Topological sort is an algorithm used for the ordering of vertices in a graph. It outputs linear ordering of vertices based on their dependencies. We represent dependencies as edges of the graph.
- It works only on Directed Acyclic Graphs(DAGs) - Graphs that have edges indicating direction. The vertices have one-way relationship among them. Also, there shouldn’t be any cyclic dependency among the vertices.
- Some common applications of this algorithm in the world of computer science include-
- Scheduling jobs
- Instruction scheduling
- Deciding compilation sequence through make files.
Suppose we were making Biryani and we wanted someone to tell us the linear ordering of steps in its recipe i.e. a sequence of steps that can be followed by taking one step at a time. Steps in making Biryani have interdependence among them, just like any other recipe. The generalized process of cooking chicken Biryani is outlined below-
- Buy all the required ingredients.
- Soak some rice.
- Marinate chicken.
- Prepare layering, i.e. fried Onions, mint leaves, etc.
- Cook the rice.
- Cook chicken.
- Add layering on chicken.
- Put the rice on top of it.
This problem sounds so trivial, but real-world applications of this algorithm can include hundreds of steps(vertices) and much more complex dependencies. For the purpose of this post, we will keep it short and easy to understand.
The algorithm works in the below steps:
- Find out a vertex which doesn’t have any dependency, i.e. the first task that can be performed.
- Remove the vertex from the graph, removing the dependencies(edges) that other vertices had on this task. Add this vertex to result list.
- Repeat above two steps for rest of the graph.
Pseudocode can be written as:
RS <- List to hold final ordered vertices PQ <- A queue to hold vertices chosen for processing after they are removed from the graph while PQ is not empty: -Current <- Dequeue a vertex from PQ -Add Current to RS for every vertex n dependent on Current: -Remove edge Current -> n -if n has no other incoming edge, enqueue it to PQ if graph still has edges then return error (graph is not a DAG) else return RS
This uses Breadth First Search approach. There is also Depth First Search variation of this algorithm. Basically, making sure that step 'y' that is dependent on step 'x' is performed after step 'x', is the main goal of this algorithm.
Alright then, our Biryani making problem can be solved by algorithm above. Let's put the steps in graph form:
- We will start with a vertex that is not dependent on anything else. In our case, it is 'Buy Ingredients(BI)'. We will add this to processing queue and start the loop.
- We will remove first vertex from the processing queue. Currently only 'BI' is present in our queue. We will add it to the result list.
- We will remove the edges BI->SRC, BI->PL and BI->MCH from the graph. Here, what 'removing the edge' actually means depends on your implementation. In my implementation(linked below), I have maintained an in-degree counter at each vertex 'x' to track the number of vertices 'x' is dependent on. Also, I have maintained an adjacency list at 'x' to indicate the next steps to be taken from it. Each time a vertex 'v' is added to the result list, in-degrees of vertices that were dependent on 'v' will be decremented by one.
- For all three vertices(SRC, PL, MCH), BI was the only dependency. Now that BI is added to result, these three vertices will be added to processing queue.
- Above steps will be performed until processing queue is empty.
The implementation of this problem can be found on my repository here.
The core of the topological sort algorithm is in the gist file here.
Some facts about topological sort:
- It only works on directed acyclic graph. If a graph has cycles, it can't have valid topological sort.
- A graph can have more than one valid topological sort. In above example, if we were to process vertex MCH(Marinate Chicken) before vertex PL(Prepare layering), ordering would've been a bit different(and still valid).
- Every DAG has at-least one valid topological sort.
- Time complexity of this algorithm is O(|V| + |E|). |V| being number of vertices in graph and |E| being number of edges. We are visiting every vertex once and additional |E| time for calculating in-degrees.
Feel free to ping me if you have any doubts. Also, do try Biryani at least once!