The Pareto Principle
The Pareto Principle is named after the Italian economist Vilfredo Pareto, who observed that roughly 80% of the land in Italy was owned by 20% of the population. This observation is also known as the 80/20 rule or the Law of the Vital Few. The Pareto Principle finds application in many places, such as software engineering, quality, manufacturing, word-of-mouth marketing, human resources, and government. The 80/20 rule was also popularized by the book “The McKinsey Way” as a principle that McKinsey consultants follow to solve client problems.
A result of the Pareto Principle is the Pareto Chart. The Pareto Chart is used to graphically summarize and display the relative importance of the differences between groups of data, or to visually represent the Vital Few versus the Trivial Many.
You can create a Pareto Chart in any statistical tool such as gnuplot, MATLAB, Mathematica, Stata, SAS, or Minitab. For our example, I’ll show how to create a Pareto Chart in Excel.
To create a Pareto Chart in Excel, set up a spreadsheet such that the first column contains the categories of your data, the second column the percentage for each category, and the third column the cumulative percentage for each category. The example below represents data collected on the number of citations issued by police for various violations (Category) at a particular intersection. Overall, 300 tickets were issued for various traffic violations. The percentage of tickets issued for each violation and the cumulative percentage of tickets issued were calculated and entered into an Excel spreadsheet. If the data is not already sorted, sort it from the highest percentage to the lowest. The sequence for sorting is “Data”, “Sort”, “Percent”, “Descending”.
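As a rough sketch, the same three columns could also be computed outside Excel, for example in Python. The category names and counts below are made up for illustration (only the 300-ticket total comes from the example), and `pareto_table` is a hypothetical helper name, not part of any library.

```python
def pareto_table(counts):
    """Return (category, percent, cumulative percent) rows, sorted high to low."""
    total = sum(counts.values())
    rows = []
    cumulative = 0
    # A Pareto chart needs the categories ordered from largest to smallest.
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    for category, count in ordered:
        cumulative += count
        rows.append((category, 100 * count / total, 100 * cumulative / total))
    return rows


# Hypothetical ticket counts, deliberately unsorted; they add up to 300.
tickets = {
    "Stop sign": 45,
    "Left turn": 63,
    "Seat belt": 36,
    "Speeding": 57,
    "Red light": 54,
    "Other": 18,
    "Parking": 27,
}

print(f"{'Category':<12}{'Percent':>9}{'Cumulative':>12}")
for category, percent, cumulative in pareto_table(tickets):
    print(f"{category:<12}{percent:>9.1f}{cumulative:>12.1f}")
```

The cumulative column is built from running counts rather than by adding up rounded percentages, so the last row always lands on exactly 100%.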

Then, highlight the “Categories”, “Percent”, and “Cumulative Percent” columns. Go to the Chart Wizard, select “Custom Types”, and select “Line-Columns”.

In my line of work, I use the Pareto Principle all the time. As an example, at Amazon, I used the Pareto Principle to separate the value-added time in a process from the non-value-added time. This approach helped show where to spend our time to improve a process. The Pareto Chart is a nice and easy way to visually display the data and direct efforts.
In software engineering, to avoid feature creep and to remain focused on the right features, it’s important to use the Pareto Principle to determine which 20% of the features will satisfy at least 80% of the users. In my experience, not enough software organizations use this principle in their software development.
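One way to make this concrete, as a sketch only: rank features by how much they are used and keep adding them until the running share reaches 80%. This assumes usage share is a fair proxy for user satisfaction, which it may not be. The feature names, the counts, and the `vital_few` helper are all hypothetical.

```python
def vital_few(usage, threshold=80.0):
    """Return the top items whose combined share first reaches `threshold` percent."""
    total = sum(usage.values())
    selected = []
    running = 0
    # Walk from the most-used item down, stopping once the threshold is reached.
    for name, count in sorted(usage.items(), key=lambda item: item[1], reverse=True):
        selected.append(name)
        running += count
        if 100 * running / total >= threshold:
            break
    return selected


# Hypothetical usage counts for ten features (illustration only).
feature_usage = {
    "search": 500, "checkout": 300, "wishlist": 60, "reviews": 50,
    "gift wrap": 30, "export": 20, "themes": 15, "badges": 10,
    "print view": 10, "rss": 5,
}

core = vital_few(feature_usage)
print(core, f"({len(core)} of {len(feature_usage)} features)")
```

With these made-up numbers the split happens to come out at two features out of ten, but real data will rarely be that tidy, so treat the threshold as a guide rather than a rule.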
In Human Resources, roughly 20% of the people produce 80% of the results.
In Government, roughly 20% of the group are influential and the other 80% are peripheral.
In word-of-mouth marketing or marketing in general, roughly 20% are the influential, “sneezer” types and the remaining 80% are the slow adopters or followers.
In Project Management, if a schedule slips or a milestone is not met, measure the slip against the 80/20 rule. If the milestone belongs to the vital 20%, quickly recover the project and re-focus on it. Put another way, make sure that the milestones you keep are part of the vital 20%; if a milestone in the trivial 80% slips, the team can afford that and need not freak out that the project will fail.
To apply the Pareto Principle more broadly, as a general rule, it’s important to remember that only 20% of our daily activities matter. Or, put another way, let us focus our energies and time on the 20% that will make an impact on the other 80%.
Comments
Thanks for reading.
A Pareto Chart doesn’t aim to make conjectures. It is simply a tool to show the separation in a data set. To make a conjecture, the next step after a Pareto Chart is hypothesis testing. In this step, you can make hypotheses about the data and see if the data supports those hypotheses. The tools used for this step include t-tests, regression in all its flavors, chi-square tests, and others.
Also, the 80/20 Principle is not exactly an 80/20 split. It would be nice, but outside of the ivory tower of theories, the real world is not split so cleanly. The 80/20 rule is rough, but the spirit of it is that there are a vital few and a trivial many.
Again, thanks for reading.
Pete
[...] When I was with Amazon, I led an industrial engineering project where I conducted a time study on a critical path process. This time study revealed several of the wastes mentioned above, which are stratified in the Pareto Chart below: [...]
[...] When I left Amazon.com in 2004, there were more BMVD’s than Hardlines, but Hardlines represented a higher percentage of sales and recognized revenue. In fact, there was a nice separation of data, such that an 80/20 was not met, but it’s still a nice looking Pareto. If Anderson had a more reliable data source, perhaps he could have looked at the Amazon business this way, instead of just books. Looking at one category, books, provides an incomplete view of the Amazon business; it’s not a reliable sample and hence the results from that analysis are insufficient to extrapolate from or to support the long tail thesis. [...]
[...] With site and feature development, the 80/20 rule holds. The spec should meet that 80%, and what gets cut should address the other 20%. When you have to cut features to make launch, remember that development does not end with launch. Keep a list of what gets cut and prioritize that list to be developed later. [...]


Interesting read.
I had one question though. You didn’t conjecture as to what the data for the traffic tickets represented. In this example was the idea to say that the ticket with 20% weight causes 80% of the traffic problems? In the example you gave for tickets, it appears that Left Turn and Speeding were about equally weighted, but their combined percentage is 40%, not 20%.
I would be interested to know how the data was to be interpreted.