Boosting vs. Bagging: how they work (the fun way)
Boosting and bagging explained in simple layman's terms, along with their differences in plain English
Bagging and boosting are two commonly used techniques in machine learning today. While both are intended to improve the performance of a machine learning model, operationally they are very different, and each has its own pros and cons.
The question we are trying to address here is: how can we explain both techniques, bagging and boosting, in layman's terms? I will try to keep the explanation simple and easy, without much focus on technical terms and details.
Before explaining the techniques, let us consider a scenario:
An organization has two departments with 100 employees each, all of whom are ICs (individual contributors). Each department has one project manager, and the two managers have very different management styles. (Don't ask me how this is relevant; for the sake of argument, let us go with it.)
Each of these managers was assigned a task: build a prediction algorithm to find out which customers are going to churn in the next month. Without delay, the managers put their employees to work on the problem. Here is how each of them approached it:
Manager 1 is a strong believer in the democratic process. He decided to assign the problem to each and every one of the 100 employees in his department. He asked each employee to draw a random sample from the full dataset, build a model on that sample, and come up with a predicted score and a verdict for each customer.
Once all the employees come back with their verdicts, the manager collates them. To generate a single verdict for each customer, he takes the majority vote: for example, if 70 of the 100 employees give a positive verdict, the final verdict is positive.
This approach, where the verdict is reached in a democratic way through multiple models (same algorithm, but each trained on a different sample of the data), is called “Bagging”, short for bootstrap aggregating.
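Manager 1's process can be sketched in a few lines of Python. This is only a minimal illustration, not a production implementation: the toy data, the one-rule "stump" model, and the function names `train_stump`, `bagging_fit` and `bagging_predict` are all assumptions made up for this example. Each "employee" draws a bootstrap sample (same size as the data, drawn with replacement), fits a model on it, and the final verdict is the majority vote.

```python
import random
from collections import Counter

# Hypothetical toy data: (months since last purchase, churned? 1 = yes, 0 = no).
data = [(1, 0), (2, 0), (3, 0), (4, 0), (6, 1), (7, 1), (8, 1), (9, 1)]

def train_stump(sample):
    """One employee's model: predict churn when the feature exceeds a threshold.

    Every feature value in the sample is tried as the threshold, and the one
    with the fewest mistakes on that sample is kept.
    """
    best_threshold, best_errors = None, None
    for threshold, _ in sample:
        errors = sum(int(x > threshold) != y for x, y in sample)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold

def bagging_fit(data, n_models=100, seed=0):
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap sample: same size as the data, drawn with replacement.
        sample = [rng.choice(data) for _ in data]
        models.append(train_stump(sample))
    return models

def bagging_predict(models, x):
    # Each model casts one vote; the most common verdict wins.
    votes = Counter(int(x > threshold) for threshold in models)
    return votes.most_common(1)[0][0]

models = bagging_fit(data)
print(bagging_predict(models, 2), bagging_predict(models, 8))
```

On this toy data, customers with low feature values should come back as 0 (stays) and customers with high values as 1 (churns), even though individual models disagree now and then because each saw a slightly different sample.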
Manager 2, on the other hand, believes that there is always scope for improvement, and that every time a new employee looks at the model, new insights and value get added. So he asked his employees to work in sequence: first, a randomly picked employee will generate a candidate model. Once the model is built, it will be passed to the next employee (randomly picked, excluding the first one). While developing his own model, the second employee will try to reduce the errors committed by employee 1.
To do this, employee 2 keeps all the examples that employee 1 got wrong but gives the correctly predicted examples less weight, so that the new model concentrates on the mistakes. How strongly each new model is allowed to correct the previous ones is controlled by a setting called the “Learning Rate” of the boosting algorithm. He then builds another model and passes it on to the next employee.
This process continues until all 100 employees have taken a pass at the model and improved it to some extent. At the end, the manager takes the verdict generated by the 100th employee, whose model carries the corrections of everyone before him, as the final one.
This approach, where the errors get boosted at each step and the model is made to learn how to do a better job, is called the “Boosting” algorithm.
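Manager 2's process can be sketched in the same spirit. The story does not name a specific boosting algorithm, so the sketch below assumes an AdaBoost-style scheme: every example carries a weight, each new simple model is fit to the weighted data, the weights of wrongly predicted examples go up, and the final verdict is a weighted vote of all the models. The toy data, the stump model, and the names `fit_stump`, `boost_fit`, `boost_predict` and `accuracy` are hypothetical. The `learning_rate` parameter here simply scales how much say each model gets.

```python
import math

# Hypothetical toy data: (feature, label) with labels +1 / -1.
# No single threshold separates the classes, so one simple model is not enough.
data = [(1, -1), (2, -1), (3, -1), (4, 1), (5, 1), (6, 1),
        (7, -1), (8, -1), (9, -1)]

def stump_predict(x, threshold, sign):
    # Predict `sign` above the threshold and the opposite at or below it.
    return sign if x > threshold else -sign

def fit_stump(data, weights):
    """Pick the threshold and direction with the lowest weighted error."""
    best = None
    for threshold in sorted({x for x, _ in data}):
        for sign in (1, -1):
            error = sum(w for (x, y), w in zip(data, weights)
                        if stump_predict(x, threshold, sign) != y)
            if best is None or error < best[0]:
                best = (error, threshold, sign)
    return best

def boost_fit(data, n_rounds, learning_rate=1.0):
    weights = [1.0 / len(data)] * len(data)   # every example starts equal
    ensemble = []
    for _ in range(n_rounds):
        error, threshold, sign = fit_stump(data, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)   # avoid log(0)
        alpha = learning_rate * 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, threshold, sign))
        # Mistakes gain weight, correct predictions lose weight.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, sign))
                   for (x, y), w in zip(data, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def boost_predict(ensemble, x):
    # Weighted vote of all models built so far.
    score = sum(alpha * stump_predict(x, t, s) for alpha, t, s in ensemble)
    return 1 if score > 0 else -1

def accuracy(ensemble):
    return sum(boost_predict(ensemble, x) == y for x, y in data) / len(data)

print(accuracy(boost_fit(data, n_rounds=1)))
print(accuracy(boost_fit(data, n_rounds=3)))
```

On this toy data, a single round gets 6 of the 9 examples right, while three rounds get all 9 right: each later model focuses on what the earlier ones missed.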
Differences between the bagging and boosting algorithms
As evident from the above two examples, there are some basic differences between these two algorithms. Let us list down these differences:
- Bagging can operate in parallel: in the above example, no employee depends on any other to build their own candidate model. Boosting works in sequence: an employee has to wait for the previous employee to finish before he can take that model, find its errors, and perform his own steps. Hence this type of algorithm usually takes longer to train.
- Bagging does not leverage the information of other employees. As a result, the performance of a bagging algorithm is often lower than that of a boosting algorithm, which puts a lot of focus on the errors and tries to reduce them.
- Bagging is considered a stable solution because of the democratic approach it takes. Boosting, on the other hand, often suffers from overfitting. Choosing the learning rate carefully is important for a boosting algorithm to converge to a good solution.
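The first and last points in the list can be made concrete with a small sketch, again under made-up assumptions: the toy data and the functions `train_bagged_model` and `boosted_estimate` are hypothetical. The bagging-style workers need nothing from one another, so running them one by one or handing them all to a pool of workers gives the same models. (Threads are used here only to show that independence; for heavy model training, real speed-ups would typically come from separate processes or machines.) The boosting-style loop is a gradient-boosting-flavoured toy: each round corrects what is left over from the previous round, so it cannot start early, and the learning rate decides how big each correction is.

```python
import random
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy data: (feature, churned? 1 = yes, 0 = no).
DATA = [(1, 0), (2, 0), (3, 0), (4, 0), (6, 1), (7, 1), (8, 1), (9, 1)]

def train_bagged_model(seed):
    """Bagging-style worker: needs only its own random sample, nothing from peers."""
    rng = random.Random(seed)
    sample = [rng.choice(DATA) for _ in DATA]
    churned = [x for x, y in sample if y == 1]
    stayed = [x for x, y in sample if y == 0]
    if not churned or not stayed:
        # Degenerate sample with one class only: fall back to the overall mean.
        return sum(x for x, _ in DATA) / len(DATA)
    # Model = threshold halfway between the two class averages.
    return (sum(churned) / len(churned) + sum(stayed) / len(stayed)) / 2

def boosted_estimate(targets, n_rounds, learning_rate):
    """Boosting-style loop: each round corrects the previous round's leftover error."""
    prediction = 0.0
    for _ in range(n_rounds):
        residual = sum(t - prediction for t in targets) / len(targets)
        # This step needs the prediction from the round before it.
        prediction += learning_rate * residual
    return prediction

seeds = range(100)
one_by_one = [train_bagged_model(s) for s in seeds]
with ThreadPoolExecutor(max_workers=8) as pool:
    all_at_once = list(pool.map(train_bagged_model, seeds))

print(one_by_one == all_at_once)
print(boosted_estimate([8, 12], n_rounds=3, learning_rate=0.5))
print(boosted_estimate([8, 12], n_rounds=3, learning_rate=1.0))
```

With a learning rate of 0.5 the estimate creeps toward the average target of 10 (5, then 7.5, then 8.75 after three rounds), while a learning rate of 1.0 jumps there in one step. Small steps need more rounds but are less likely to chase noise, which is the trade-off behind choosing the learning rate.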
Both bagging and boosting algorithms have their own pros and cons. One has to be mindful of selecting the right algorithm for the problem at hand.
That will be all on this topic. I hope you now have a clear idea of what a bagging algorithm is vis-a-vis what a boosting algorithm is.